Abstract. We present a parametric deterministic formulation of Bayesian inverse
problems with input parameter from infinite dimensional, separable Banach spaces.
In this formulation, the forward problems are parametric, deterministic elliptic partial
differential equations, and the inverse problem is to determine the unknown, parametric
deterministic coefficients from noisy observations comprising linear functionals of the
solution.

We prove a generalized polynomial chaos representation of the posterior density 
with respect to the prior measure, given noisy observational data. We analyze the 
sparsity of the posterior density in terms of the summability of the input data's 
coefficient sequence. The first step in this process is to estimate the fluctuations in the
prior. We exhibit sufficient conditions on the prior model in order for approximations
of the posterior density to converge at a given algebraic rate, in terms of the number N
of unknowns appearing in the parametric representation of the prior measure. Similar
sparsity and approximation results are also exhibited for the solution and covariance 
of the elliptic partial differential equation under the posterior. These results then form 
the basis for efficient uncertainty quantification, in the presence of data with noise. 



1. Introduction 



Quantification of the uncertainty in predictions made by physical models, resulting
from uncertainty in the input parameters to those models, is of increasing importance 
in many areas of science and engineering. Considerable effort has been devoted to 
developing numerical methods for this task. The most straightforward approach is 
to sample the uncertain system responses by Monte Carlo simulations. These have 
the advantage of being conceptually straightforward, but are constrained in terms of 
efficiency by their $N^{-1/2}$ rate of convergence ($N$ number of samples). In the 1980s
the engineering community started to develop new approaches to the problem via
parametric representation of the probability space for the input parameters [23, 21]
based on the pioneering ideas of Wiener [27]. The use of sparse spectral approximation
techniques [26, 22] opens the avenue towards algorithms for computational quantification
of uncertainty which beat the asymptotic complexity of Monte Carlo (MC) methods, as
measured by computational cost per unit error in predicted uncertainty.

Most of the work in this area has been confined to the use of probability 
models on the input parameters which are very simple, albeit leading to high 
dimensional parametric representations. Typically the randomness is described by a 
(possibly countably infinite) set of independent random variables representing uncertain 
coefficients in parametric expansions of input data, typically with known closed 
form Lebesgue densities. In many applications, such uncertainty in parameters is 
compensated for by (possibly noisy) observations, leading to an inverse problem. One 
approach to such inverse problems is via the techniques of optimal control [2]; however 
this does not lead naturally to quantification of uncertainty. A Bayesian approach to 
the inverse problem [Ml [25] allows the observations to map a possibly simple prior 
probability distribution on the input parameters into a posterior distribution. This 
posterior distribution is typically much more complicated than the prior, involving 
many correlations and without a useable closed form. The posterior distribution 
completely quantifies the uncertainty in the system's response, under given prior and 
structural assumptions on the system and given observational data. It allows, in 
particular, the Bayesian statistical estimation of unknown system parameters and 
responses by integration with respect to the posterior measure, which is of interest 
in many applications. 

Markov chain Monte Carlo (MCMC) methods can be used to probe this posterior
probability distribution. This allows for computation of estimates of uncertain system 
responses conditioned on given observation data by means of approximate integration. 
However, these methods suffer from the same limits on computational complexity as 
straightforward Monte Carlo methods. It is hence of interest to investigate whether 
sparse approximation techniques can be used to approximate the posterior density 
and conditional expectations given the data. In this paper we study this question
in the context of a model elliptic inverse problem. Elliptic problems with random 
coefficients have provided an important class of model problems for the uncertainty 
quantification community, see, for example, [U |22] and the references therein. In the 
context of inverse problems and noisy observational data, the corresponding elliptic 
problem arises naturally in the study of groundwater flow (see [19]) where hydrologists
wish to determine the transmissivity (diffusion coefficient) from the head (solution of
the elliptic PDE). The elliptic inverse problem hence provides a natural model problem
within which to study sparse representations of the posterior distribution.

In Section 2 we recall the Bayesian setting for inverse problems from [25], stating
and proving an infinite dimensional Bayes rule adapted to our inverse problem setting in
Theorem 2.1. Section 3 formulates the forward and inverse elliptic problem of interest,
culminating in an application of Bayes rule in Theorem 3.4. The prior model is built on
the work in [3, 6] in which the diffusion coefficient is represented parametrically via an
infinite sum of functions, each with an independent uniformly distributed and compactly
supported random variable as coefficient. Once we have shown that the posterior
measure is well-defined and absolutely continuous with respect to the prior, we proceed
to study the analytic dependence of the posterior density in Section 4, culminating
in Theorems 4.2 and 4.8. In Section 5 we show how this parametric representation,
and analyticity, may be employed to develop sparse polynomial chaos representations
of the posterior density, and the key Theorem 5.9 summarizes the achievable rates
of convergence. In Section 6 we study a variety of practical issues that arise in
attempting to exploit the sparse polynomial representations as realizable algorithms for
the evaluation of (posterior) expectations. Section 7 contains our concluding remarks
and, in particular, a discussion of the computational complexity of the new methodology,
in comparison with that for Monte Carlo based methods.

Throughout we concentrate on the posterior density itself. However we also provide 
analysis related to the analyticity (and hence sparse polynomial representation) of 
various functions of the unknown input, in particular the solution to the forward elliptic 
problem, and tensor products of this function. For the above class of elliptic model 
problems, we prove that for given data, there exist sparse, $N$-term gpc ("generalized
polynomial chaos") approximations of this expectation with respect to the posterior
(which is written as a density reweighted expectation with respect to the prior) which
converge at the same rates afforded by best $N$-term gpc approximations of the system
response to uncertain, parametric inputs. Moreover, our analysis implies that the set
of the "active" gpc-coefficients is identical to the set $\Lambda_N$ of indices of a best $N$-term
approximation of the system's response. It was shown in [6, 7] that these rates are, in
turn, completely determined by the decay rates of the input's fluctuation expansions.
We thus show that the machinery developed to describe gpc approximations of uncertain 
system response may be employed to study the more involved Bayesian inverse problem 
where the uncertainty is conditioned on observational data. Numerical algorithms which 
achieve the optimal complexity implied by the sparse approximations, and numerical 
results demonstrating this will be given in our forthcoming work [1]. 

2. Bayesian Inverse Problems 

Let $G : X \to R$ denote a "forward" map from some separable Banach space $X$ of
unknown parameters into another separable Banach space $R$ of responses. We equip
$X$ and $R$ with norms $\|\cdot\|_X$ and $\|\cdot\|_R$, respectively. In addition, we are given
$O(\cdot) : R \to \mathbb{R}^K$ denoting a bounded linear observation operator on the space $R$ of system
responses, whose components belong to the dual space $R^*$ of the space $R$ of system responses. We
assume that the data is finite so that $K < \infty$, and equip $\mathbb{R}^K$ with the Euclidean norm,
denoted by $|\cdot|$.

We wish to determine the unknown data $u \in X$ from the noisy observations

$$\delta = O(G(u)) + \eta \tag{1}$$

where $\eta \in \mathbb{R}^K$ represents the noise. We assume that the realization of the noise process is
not known to us, but that it is a draw from the Gaussian measure $\mathcal{N}(0, \Gamma)$, for some
positive (known) covariance operator $\Gamma$ on $\mathbb{R}^K$. If we define $\mathcal{G} : X \to \mathbb{R}^K$ by $\mathcal{G} = O \circ G$
then we may write the equation for the observations as

$$\delta = \mathcal{G}(u) + \eta. \tag{2}$$

We define the least squares functional (also referred to as "potential" in what follows)
$\Phi : X \times \mathbb{R}^K \to \mathbb{R}$ by

$$\Phi(u; \delta) = \frac{1}{2}\, |\delta - \mathcal{G}(u)|_\Gamma^2 \tag{3}$$

where $|\cdot|_\Gamma = |\Gamma^{-1/2} \cdot|$ so that

$$\Phi(u; \delta) = \frac{1}{2} \Big( (\delta - \mathcal{G}(u))^\top \Gamma^{-1} (\delta - \mathcal{G}(u)) \Big).$$
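
As a small illustration of definition (3), the following sketch evaluates the potential for a finite observation vector. It is only a sketch under stated assumptions: the function name `potential` and the sample values for the data, the observed response and the covariance are hypothetical and do not come from the text.

```python
import numpy as np

def potential(delta, g_u, gamma):
    """Phi(u; delta) = 0.5 * |Gamma^{-1/2} (delta - G(u))|^2, cf. (3)."""
    residual = np.asarray(delta, dtype=float) - np.asarray(g_u, dtype=float)
    # Solve Gamma x = residual instead of forming Gamma^{-1} explicitly.
    return 0.5 * float(residual @ np.linalg.solve(gamma, residual))

# Hypothetical data with K = 2 observations and diagonal noise covariance.
gamma = np.diag([0.5, 2.0])
delta = np.array([1.0, 3.0])
g_u = np.array([0.0, 1.0])      # stand-in for the observed response G(u)
phi_value = potential(delta, g_u, gamma)
```

The potential vanishes when the data match the observed response exactly and grows quadratically with the misfit, weighted by the inverse noise covariance.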

In [25] it is shown that, under appropriate conditions on the forward and observation 
model $\mathcal{G}$ and the prior measure on $u$, the posterior distribution on $u$ is absolutely
continuous with respect to the prior with Radon-Nikodym derivative given by an 
infinite dimensional version of Bayes rule. Posterior uncertainty is then determined 
by integration of suitably chosen functions against this posterior. At the heart of the 
deterministic approach proposed and analyzed here lies the reformulation of the forward 
problem with stochastic input data as an infinite dimensional, parametric deterministic 
problem. We are thus interested in expressing the posterior distribution in terms of a 
parametric representation of the unknown coefficient function u. To this end we assume 
that, under the prior distribution, this function admits a parametric representation of 
the form 

$$u = \bar a + \sum_{j \in \mathbb{J}} y_j \psi_j \tag{4}$$

where $y = \{y_j\}_{j \in \mathbb{J}}$ is an i.i.d. sequence of real-valued random variables $y_j \sim \mathcal{U}(-1, 1)$ and
$\bar a$ and the $\psi_j$ are elements of $X$. Here and throughout, $\mathbb{J}$ denotes a finite or countably
infinite index set, i.e. either $\mathbb{J} = \{1, 2, \ldots, J\}$ or $\mathbb{J} = \mathbb{N}$. All assertions proved in the
present paper hold in either case, and all bounds are in particular independent of the
number $J$ of parameters.
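
The parametric representation (4) can be sketched numerically for a finite index set. This is a minimal sketch, not taken from the text: the mean field `a_bar`, the fluctuation functions `psi`, the spatial grid and the number of parameters `J` are all hypothetical choices made for illustration.

```python
import numpy as np

def coefficient(x, y, a_bar, psi):
    """Evaluate u(x, y) = a_bar(x) + sum_j y_j * psi_j(x) for a finite index set."""
    u = a_bar(x).astype(float)
    for j, y_j in enumerate(y, start=1):
        u = u + y_j * psi(j, x)
    return u

# Hypothetical mean field and fluctuations on D = (0, 1).
a_bar = lambda x: np.ones_like(x)
psi = lambda j, x: 0.3 * j ** -2.0 * np.sin(j * np.pi * x)

rng = np.random.default_rng(0)
J = 8
y = rng.uniform(-1.0, 1.0, size=J)    # y_j ~ U(-1, 1), i.i.d.
x = np.linspace(0.0, 1.0, 201)
u = coefficient(x, y, a_bar, psi)
```

With these choices the sup norms of the fluctuations sum to less than 0.5, so every realization of `u` stays strictly between 0.5 and 1.5.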

To derive the parametric expression of the prior measure $\mu_0$ on $y$ we denote by

$$U = (-1, 1)^{\mathbb{J}}$$

the space of all sequences $(y_j)_{j \in \mathbb{J}}$ of real numbers $y_j \in (-1, 1)$. Denoting the sub $\sigma$-algebra
of Borel subsets on $\mathbb{R}$ which are also subsets of $(-1, 1)$ by $\mathcal{B}^1(-1, 1)$, the pair

$$(U, \mathcal{B}) = \Big( (-1, 1)^{\mathbb{J}},\ \bigotimes_{j \in \mathbb{J}} \mathcal{B}^1(-1, 1) \Big) \tag{5}$$

is a measurable space. We equip $(U, \mathcal{B})$ with the uniform probability measure

$$\mu_0(dy) := \bigotimes_{j \in \mathbb{J}} \frac{dy_j}{2} \tag{6}$$

which corresponds to bounded intervals for the possibly countably many uncertain
parameters. Since the countable product of probability measures is again a probability
measure, $(U, \mathcal{B}, \mu_0)$ is a probability space. We assume in what follows that the prior
measure on the uncertain input data, parametrized in the form (4), is $\mu_0(dy)$. We add
in passing that unbounded parameter ranges as arise, e.g., in lognormal random diffusion
coefficients in models for subsurface flow [19], can be treated by the techniques developed
here, at the expense of additional technicalities. We refer to [1] for details as well as for
numerical experiments.

Define $\Xi : U \to \mathbb{R}^K$ by

$$\Xi(y) = \mathcal{G}(u). \tag{7}$$

In the following we view $U$ as a bounded subset in $\ell^\infty(\mathbb{J})$, the Banach space of bounded
sequences, and thereby introduce a notion of continuity in $U$.

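Theorem 2.1. Assume that $\Xi : \bar U \to \mathbb{R}^K$ is bounded and continuous. Then $\mu^\delta(dy)$, the
distribution of $y$ given $\delta$, is absolutely continuous with respect to $\mu_0(dy)$. Furthermore, if

$$\Theta(y) = \exp(-\Phi(u; \delta)), \tag{8}$$

then

$$\frac{d\mu^\delta}{d\mu_0}(y) = \frac{1}{Z}\, \Theta(y) \tag{9}$$

where

$$Z = \int_U \Theta(y)\, \mu_0(dy). \tag{10}$$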

Proof. Let $\nu_0$ denote the probability measure on $U \times \mathbb{R}^K$ defined by $\mu_0(dy) \otimes \nu_\Gamma(d\delta)$,
where $\nu_\Gamma$ is the Gaussian measure $\mathcal{N}(0, \Gamma)$. Now define a second probability measure $\nu$
on $U \times \mathbb{R}^K$ as follows. First we specify the distribution of $\delta$ given $y$ to be $\mathcal{N}(\Xi(y), \Gamma)$.
Since $\Xi : \bar U \to \mathbb{R}^K$ is continuous and $\mu_0(U) = 1$ we deduce that $\Xi$ is $\mu_0$ measurable.
Hence we may complete the definition of $\nu$ by specifying that $y$ is distributed according
to $\mu_0$. By construction, and ignoring the constant of proportionality which depends only
on $\delta$,¹

$$\frac{d\nu}{d\nu_0}(y, \delta) \propto \Theta(y).$$

From the boundedness of $\Xi$ on $\bar U$ we deduce that $\Theta$ is bounded from below on $\bar U$ by
$\theta_0 > 0$ and hence that

$$Z \ge \int_U \theta_0\, \mu_0(dy) = \theta_0 > 0$$

since $\mu_0(U) = 1$. Noting that, under $\nu_0$, $y$ and $\delta$ are independent, Lemma 5.3 in [12]
gives the desired result. □

¹ $\Theta(y)$ is also a function of $\delta$ but we suppress this for economy of notation.
We assume that we wish to compute the expectation of a function $\phi : X \to S$, for
some Banach space $S$. With $\phi$, we associate the parametric mapping

$$\Psi(y) = \exp(-\Phi(u; \delta))\, \phi(u) : U \to S. \tag{11}$$

From $\Psi$ we define

$$Z' = \int_U \Psi(y)\, \mu_0(dy) \in S \tag{12}$$

so that the expectation of interest is given by $Z'/Z \in S$. Thus our aim is to approximate
$Z'$ and $Z$. Typical choices for $\phi$ in applications might be $\phi(u) = G(u)$, the response of
the system, or

$$\phi(u) := (G(u))^{(m)} := \underbrace{G(u) \otimes \cdots \otimes G(u)}_{m \text{ times}} \in S = R^{(m)} := \underbrace{R \otimes \cdots \otimes R}_{m \text{ times}}. \tag{13}$$

In particular the choices $\phi(u) = G(u)$ and $\phi(u) = G(u) \otimes G(u)$ together facilitate
computation of the mean and covariance of the response.
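
For a single parameter, the normalization constant in (10) and the ratio $Z'/Z$ can be approximated by quadrature against the uniform prior. The sketch below is illustrative only: it uses Gauss-Legendre quadrature, which is a convenient stand-in and not the sparse approximation developed in this paper, and the scalar observation map `g`, the datum `delta` and the noise variance `gamma` are hypothetical.

```python
import numpy as np

def posterior_expectation(phi, g, delta, gamma, n_quad=40):
    """Approximate Z from (10) and Z'/Z from (12) for one parameter y in (-1, 1)."""
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)
    w = 0.5 * weights                                        # prior mu_0(dy) = dy / 2
    theta = np.exp(-0.5 * (delta - g(nodes)) ** 2 / gamma)   # Theta(y), cf. (8)
    Z = np.sum(w * theta)
    Z_prime = np.sum(w * theta * phi(nodes))
    return Z, Z_prime / Z

# Hypothetical scalar observation map: midpoint value of the solution of
# -(u p')' = 1 on (0, 1), p(0) = p(1) = 0, with constant coefficient u = 2 + y.
g = lambda y: 1.0 / (8.0 * (2.0 + y))

# Posterior mean of y given the datum delta = 0.05 (noise variance 1e-4).
Z, mean_y = posterior_expectation(lambda y: y, g, delta=0.05, gamma=1e-4)

# With very large noise variance the posterior reverts to the prior.
Z_flat, mean_flat = posterior_expectation(lambda y: y, g, delta=0.05, gamma=1e12)
```

Since $\Theta \le 1$ and $\mu_0$ is a probability measure, the computed $Z$ never exceeds 1, and in the uninformative limit the posterior mean of $y$ returns to the prior mean 0.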

In the next sections we will study the elliptic problem and deduce, from known
results concerning the parametric forward problem, the joint analyticity of the posterior
density $\Theta(y)$, and also $\Psi(y)$, as a function of the parameter vector $y \in U$. From these
results, we deduce sharp estimates on the size of the domain of analyticity of $\Theta(y)$ (and $\Psi(y)$)
as a function of each coordinate $y_j$, $j \in \mathbb{N}$. We concentrate on the concrete choice of
$\phi$ defined by (13), and often the case p = 1. The analysis can be extended to other
choices of $\phi$.



3. Model Parametric Elliptic Problem 

3.1. Function Spaces 

Our aim is to study the inverse problem of determining the diffusion coefficient $u$ of an
elliptic PDE from observation of a finite set of noisy linear functionals of the solution
$p$, given $u$.

Let $D$ be a bounded Lipschitz domain in $\mathbb{R}^d$, $d = 1$, 2 or 3, with Lipschitz boundary
$\partial D$. Let further $(H, (\cdot, \cdot), \|\cdot\|)$ denote the Hilbert space $L^2(D)$ which we will identify
throughout with its dual space, i.e. $H \simeq H^*$.

We define also the space $V$ of variational solutions of the forward problem:
specifically, we let $(V, (\nabla\cdot, \nabla\cdot), \|\cdot\|_V)$ denote the Hilbert space $H_0^1(D)$ (everything that
follows will hold for rather general, elliptic problems with affine parameter dependence
and "energy" space $V$). The dual space $V^*$ of all continuous, linear functionals on $V$
is isomorphic to the Banach space $H^{-1}(D)$ which we equip with the dual norm to $V$,
denoted $\|\cdot\|_{-1}$. We shall assume for the (deterministic) data $f \in V^*$.
3.2. Forward Problem

In the bounded Lipschitz domain $D$ we consider the following elliptic PDE:

$$-\nabla \cdot (u \nabla p) = f \ \text{ in } D, \qquad p = 0 \ \text{ on } \partial D. \tag{14}$$

Given data $u \in L^\infty(D)$, a weak solution of (14) for any $f \in V^*$ is a function $p \in V$
which satisfies

$$\int_D u(x) \nabla p(x) \cdot \nabla q(x)\, dx = {}_V\langle q, f \rangle_{V^*} \quad \text{for all } q \in V. \tag{15}$$

Here ${}_V\langle \cdot, \cdot \rangle_{V^*}$ denotes the dual pairing between elements of $V$ and $V^*$.
For the well-posedness of the forward problem, we shall work under

Assumption 3.1. There exist constants $0 < a_{\min} \le a_{\max} < \infty$ so that

$$0 < a_{\min} \le u(x) \le a_{\max} < \infty, \quad x \in D. \tag{16}$$

Under Assumption 3.1, the Lax-Milgram Lemma ensures the existence and
uniqueness of the response $p$ of (15). Thus, in the notation of the previous section,
$R = V$ and $G(u) = p$. Moreover, this variational solution satisfies the a-priori estimate

$$\|G(u)\|_V = \|p\|_V \le \frac{\|f\|_{V^*}}{a_{\min}}. \tag{17}$$
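
The a-priori estimate (17) has an exact discrete analogue that can be checked numerically. The following is a sketch under stated assumptions: a one-dimensional domain $D = (0, 1)$, piecewise linear finite elements on a uniform grid, a coefficient that is constant on each element, and a hypothetical coefficient and right hand side. None of these discretization choices come from the text.

```python
import numpy as np

def stiffness(u_mid, h):
    """P1 finite element matrix of p -> -(u p')' on a uniform grid, p = 0 at both ends.
    u_mid holds one coefficient value per element."""
    main = (u_mid[:-1] + u_mid[1:]) / h
    off = -u_mid[1:-1] / h
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

n_el = 64
h = 1.0 / n_el
x_mid = (np.arange(n_el) + 0.5) * h
u_mid = 1.0 + 0.5 * np.sin(2.0 * np.pi * x_mid)   # hypothetical coefficient
a_min = 0.5                                       # lower bound as in (16)

A = stiffness(u_mid, h)
K = stiffness(np.ones(n_el), h)        # Gram matrix of the V inner product
b = h * np.ones(n_el - 1)              # load vector for f = 1
p = np.linalg.solve(A, b)

norm_p_V = np.sqrt(p @ K @ p)                        # discrete ||p||_V
norm_f_dual = np.sqrt(b @ np.linalg.solve(K, b))     # discrete ||f||_{V*}
```

In this discrete setting, `norm_p_V` is bounded by `norm_f_dual / a_min`, by the same coercivity argument that gives (17).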

We assume that the observation function $O : V \to \mathbb{R}^K$ comprises $K$ linear functionals
$o_k \in V^*$, $k = 1, \ldots, K$. In the notation of the previous section, we denote by $X = L^\infty(D)$
the Banach space in which the unknown input parameter $u$ takes values. It follows that

$$|\mathcal{G}(u)| \le \frac{\|f\|_{V^*}}{a_{\min}} \Big( \sum_{k=1}^K \|o_k\|_{V^*}^2 \Big)^{1/2}. \tag{18}$$



3.3. Structural Assumptions on Diffusion Coefficient 

As discussed in Section 2, we introduce a parametric representation of the random
input parameter $u$ via an affine representation with respect to $y$, which means that
the parameters $y_j$ are the coefficients of the function $u$ in the formal series expansion

$$u(x, y) = \bar a(x) + \sum_{j \in \mathbb{J}} y_j \psi_j(x), \quad x \in D, \tag{19}$$

where $\bar a \in L^\infty(D)$ and $\{\psi_j\}_{j \in \mathbb{J}} \subset L^\infty(D)$. We are interested in the effect of
approximating the solution's input parameter $u(x, y)$ by truncation of the series
expansion (19) in the case $\mathbb{J} = \mathbb{N}$, and in the corresponding effect on the forward
(resp. observational) map $G(u(\cdot))$ (resp. $\mathcal{G}(u(\cdot))$) to the family of elliptic equations
with the above input parameters. In the decomposition (19), we have the choice to
either normalize the basis (e.g., assume they all have norm one in some space) or to
normalize the parameters. It is more convenient for us to do the latter. This leads us
to the following assumptions which shall be made throughout:

i) For all $j \in \mathbb{J}$: $\psi_j \in L^\infty(D)$ and $\psi_j(x)$ is defined for all $x \in D$,

ii) $y = (y_1, y_2, \ldots) \in U = [-1, 1]^{\mathbb{J}}$, (20)
i.e. the parameter vector $y$ in (19) belongs to the unit ball of the sequence space
$\ell^\infty(\mathbb{J})$,

iii) for each $y$ to be considered, (19) holds for every $x \in D$ and every $y \in U$.

We will, on occasion, use (19) with $\mathbb{J} \subset \mathbb{N}$, as well as with $\mathbb{J} = \mathbb{N}$ (in the latter case
the additional Assumption 3.2 below has to be imposed). In either case, we will work
throughout under the assumption that the ellipticity condition (16) holds uniformly for
$y \in U$.

Uniform Ellipticity Assumption: there exist $0 < a_{\min} \le a_{\max} < \infty$ such that for all
$x \in D$ and for all $y \in U$

$$0 < a_{\min} \le u(x, y) \le a_{\max} < \infty. \tag{21}$$

We refer to assumption (21) as UEA($a_{\min}$, $a_{\max}$) in the following. In particular,
UEA($a_{\min}$, $a_{\max}$) implies $a_{\min} \le \bar a(x) \le a_{\max}$ for all $x \in D$, since we can choose $y_j = 0$ for
all $j \in \mathbb{N}$. Also observe that the validity of the lower and upper inequality in (21) for
all $y \in U$ are respectively equivalent to the conditions that

$$\sum_{j \in \mathbb{J}} |\psi_j(x)| \le \bar a(x) - a_{\min}, \quad x \in D, \tag{22}$$

and

$$\sum_{j \in \mathbb{J}} |\psi_j(x)| \le a_{\max} - \bar a(x), \quad x \in D. \tag{23}$$

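We shall require in what follows a quantitative control of the relative size of the
fluctuations in the representation (19). To this end, we shall impose

Assumption 3.2. The functions $\bar a$ and $\psi_j$ in (19) satisfy

$$\sum_{j \in \mathbb{J}} \|\psi_j\|_{L^\infty(D)} \le \frac{\kappa}{1 + \kappa}\, \bar a_{\min},$$

with $\bar a_{\min} = \min_{x \in \bar D} \bar a(x) > 0$ and $\kappa > 0$.

Assumption 3.1 is then satisfied by choosing

$$a_{\min} := \bar a_{\min} - \frac{\kappa}{1 + \kappa}\, \bar a_{\min} = \frac{1}{1 + \kappa}\, \bar a_{\min}. \tag{24}$$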
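
The relation between Assumption 3.2 and the ellipticity constant in (24) can be made concrete with a short computation. This is a sketch only: the helper name `kappa_from_fluctuations` and the sample sup norms are hypothetical, and the helper returns the smallest $\kappa$ for which Assumption 3.2 holds, i.e. the one attaining equality.

```python
import numpy as np

def kappa_from_fluctuations(a_bar_min, psi_sup_norms):
    """Smallest kappa with sum_j ||psi_j|| <= kappa / (1 + kappa) * a_bar_min,
    together with the resulting a_min from (24)."""
    s = float(np.sum(psi_sup_norms))
    if not 0.0 < s < a_bar_min:
        raise ValueError("fluctuations must be nonzero and smaller than a_bar_min")
    kappa = s / (a_bar_min - s)
    a_min = a_bar_min / (1.0 + kappa)
    return kappa, a_min

# Hypothetical sup norms ||psi_j|| = 0.3 / j^2 for j = 1, ..., 50 and a_bar_min = 1.
norms = 0.3 / np.arange(1, 51) ** 2.0
kappa, a_min = kappa_from_fluctuations(1.0, norms)
```

For this choice of $\kappa$ the constant in (24) reduces to $\bar a_{\min}$ minus the total size of the fluctuations, which matches the lower inequality in (22).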

3.4. Inverse Problem

We start by proving that the forward maps $G : X \to V$ and $\mathcal{G} : X \to \mathbb{R}^K$ are Lipschitz.

Lemma 3.3. If $p$ and $\tilde p$ are solutions of (15) with the same right hand side $f$ and with
coefficients $u$ and $\tilde u$, respectively, and if these coefficients both satisfy Assumption 3.1,
then the forward solution map $u \to p = G(u)$ is Lipschitz as a mapping from $X$ into $V$
with Lipschitz constant defined by

$$\|p - \tilde p\|_V \le \frac{\|f\|_{V^*}}{a_{\min}^2}\, \|u - \tilde u\|_{L^\infty(D)}. \tag{25}$$

Moreover the forward solution map can be composed with the observation operator to
prove that the map $u \to \mathcal{G}(u)$ is Lipschitz as a mapping from $X$ into $\mathbb{R}^K$ with Lipschitz
constant defined by

$$|\mathcal{G}(u) - \mathcal{G}(\tilde u)| \le \frac{\|f\|_{V^*}}{a_{\min}^2} \Big( \sum_{k=1}^K \|o_k\|_{V^*}^2 \Big)^{1/2} \|u - \tilde u\|_{L^\infty(D)}. \tag{26}$$

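Proof. Subtracting the variational formulations for $p$ and $\tilde p$, we find that for all
$q \in V$,

$$0 = \int_D u \nabla p \cdot \nabla q\, dx - \int_D \tilde u \nabla \tilde p \cdot \nabla q\, dx = \int_D u (\nabla p - \nabla \tilde p) \cdot \nabla q\, dx + \int_D (u - \tilde u) \nabla \tilde p \cdot \nabla q\, dx.$$

Therefore $w = p - \tilde p$ is the solution of $\int_D u \nabla w \cdot \nabla q = L(q)$ where $L(v) := \int_D (\tilde u - u) \nabla \tilde p \cdot \nabla v$.
Hence

$$\|w\|_V \le \frac{\|L\|_{V^*}}{a_{\min}},$$

and we obtain (25) since it follows from (17) that

$$\|L\|_{V^*} = \max_{\|v\|_V = 1} |L(v)| \le \|u - \tilde u\|_{L^\infty(D)} \|\tilde p\|_V \le \|u - \tilde u\|_{L^\infty(D)}\, \frac{\|f\|_{V^*}}{a_{\min}}.$$

Lipschitz continuity of $\mathcal{G} = O \circ G : X \to \mathbb{R}^K$ is immediate since $O$ comprises the $K$
linear functionals $o_k$. Thus (25) implies (26). □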
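
The Lipschitz bound (25) can be observed numerically in a discretized one-dimensional setting, where the same argument applies verbatim. The sketch below assumes piecewise linear finite elements on $D = (0, 1)$ with elementwise constant coefficients. The two coefficients and the right hand side are hypothetical and chosen so that both coefficients are bounded below by 0.5.

```python
import numpy as np

def stiffness(u_mid, h):
    """P1 finite element matrix of p -> -(u p')' on a uniform grid, p = 0 at both ends."""
    main = (u_mid[:-1] + u_mid[1:]) / h
    off = -u_mid[1:-1] / h
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

n_el = 64
h = 1.0 / n_el
x_mid = (np.arange(n_el) + 0.5) * h
K = stiffness(np.ones(n_el), h)        # Gram matrix of the V inner product
b = h * np.ones(n_el - 1)              # load vector for f = 1

u = 1.0 + 0.4 * np.sin(2.0 * np.pi * x_mid)        # hypothetical coefficient, >= 0.6
u_tilde = u + 0.1 * np.cos(3.0 * np.pi * x_mid)    # perturbed coefficient, >= 0.5
a_min = 0.5

p = np.linalg.solve(stiffness(u, h), b)
p_tilde = np.linalg.solve(stiffness(u_tilde, h), b)

w = p - p_tilde
lhs = np.sqrt(w @ K @ w)                                   # ||p - p_tilde||_V
norm_f_dual = np.sqrt(b @ np.linalg.solve(K, b))           # discrete ||f||_{V*}
rhs = norm_f_dual / a_min**2 * np.max(np.abs(u - u_tilde)) # right hand side of (25)
```

The computed difference `lhs` stays below `rhs`; the bound is typically far from sharp because it uses the worst-case ellipticity constant twice.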
The next result may be deduced in a straightforward fashion from the preceding 
analysis: 

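Theorem 3.4. Under the UEA($a_{\min}$, $a_{\max}$) and Assumption 3.2 it follows that the
posterior measure $\mu^\delta(dy)$ on $y$ given $\delta$ is absolutely continuous with respect to the prior
measure $\mu_0(dy)$ with Radon-Nikodym derivative given by (8) and (9).

Proof. This is a straightforward consequence of Theorem 2.1 provided that we show
boundedness and continuity of $\Xi : \bar U \to \mathbb{R}^K$ given by (7). Boundedness follows from
(18), together with the boundedness of $\|o_k\|_{V^*}$, under UEA($a_{\min}$, $a_{\max}$). Let $u, \tilde u$ denote
two diffusion coefficients generated by two parametric sequences $y, \tilde y$ in $U$. Then, by
(26) and Assumption 3.2,

$$|\Xi(y) - \Xi(\tilde y)| \le \frac{\|f\|_{V^*}}{a_{\min}^2} \Big( \sum_{k=1}^K \|o_k\|_{V^*}^2 \Big)^{1/2} \|u - \tilde u\|_{L^\infty(D)} \le \frac{\|f\|_{V^*}}{a_{\min}^2} \Big( \sum_{k=1}^K \|o_k\|_{V^*}^2 \Big)^{1/2} \frac{\kappa}{1 + \kappa}\, \bar a_{\min}\, \|y - \tilde y\|_{\ell^\infty(\mathbb{J})}.$$

The result follows. □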
4. Complex Extension of the Elliptic Problem

As indicated above, one main technical objective will consist in proving analyticity of
the posterior density $\Theta(y)$ with respect to the (possibly countably many) parameters
$y \in U$ in (19) defining the prior, and to obtain bounds on the supremum of $\Theta$ over
the maximal domains in $\mathbb{C}$ into which $\Theta(y)$ can be continued analytically. Our key
ingredients for getting such estimates rely on complex analysis.

It is well-known that the existence theory for the forward problem (14) extends to
the case where the coefficient function $u(x)$ takes values in $\mathbb{C}$. In this case, the ellipticity
Assumption 3.1 should be replaced by the assumption that

$$0 < a_{\min} \le \Re(u(x)) \le |u(x)| \le a_{\max} < \infty, \quad x \in D, \tag{27}$$

and all the above results remain valid with Sobolev spaces understood as spaces of 
complex valued functions. Throughout what follows, we shall frequently pass to spaces 
of complex valued functions, without distinguishing these notationally. It will always 
be clear from the context which coefficient field is implied. 



4.1. Notation and Assumptions

We extend the definition of $u(x, y)$ to $u(x, z)$ for the complex variable $z = (z_j)_{j \in \mathbb{J}}$ (by
using the $z_j$ instead of $y_j$ in the definition of $u$ by (19)) where each $z_j$ has modulus less
than or equal to 1. Therefore $z$ belongs to the polydisc

$$\mathcal{U} := \bigotimes_{j \in \mathbb{J}} \{ z_j \in \mathbb{C} : |z_j| \le 1 \} \subset \mathbb{C}^{\mathbb{J}}. \tag{28}$$

Note that $U \subset \mathcal{U}$. Using (22) and (23), when the functions $\bar a$ and $\psi_j$ are real valued,
condition UEA($a_{\min}$, $a_{\max}$) implies that for all $x \in D$ and $z \in \mathcal{U}$,

$$0 < a_{\min} \le \Re(u(x, z)) \le |u(x, z)| \le 2 a_{\max}, \tag{29}$$

and therefore the corresponding solution $p(z)$ is well defined in $V$ for all $z \in \mathcal{U}$ by
the Lax-Milgram theorem for sesquilinear forms. More generally, we may consider an
expansion of the form

$$u(x, z) = \bar a + \sum_{j \in \mathbb{J}} z_j \psi_j$$

where $\bar a$ and $\psi_j$ are complex valued functions and replace UEA($a_{\min}$, $a_{\max}$) by the
following, complex-valued counterpart:

Uniform Ellipticity Assumption in $\mathbb{C}$: there exist $0 < a_{\min} \le a_{\max} < \infty$ such that
for all $x \in D$ and all $z \in \mathcal{U}$

$$0 < a_{\min} \le \Re(u(x, z)) \le |u(x, z)| \le a_{\max} < \infty. \tag{30}$$

We refer to (30) as UEAC($a_{\min}$, $a_{\max}$).
4.2. Domains of holomorphy

The condition UEAC($a_{\min}$, $a_{\max}$) implies that the forward solution map $z \mapsto p(z)$ is
strongly holomorphic as a $V$-valued function which is uniformly bounded in certain
domains larger than $\mathcal{U}$. For $0 < r < 2a_{\max} < \infty$ we define the open set

$$\mathcal{A}_r = \{ z \in \mathbb{C}^{\mathbb{J}} : r < \Re(u(x, z)) \le |u(x, z)| < 2a_{\max} \text{ for every } x \in D \} \subset \mathbb{C}^{\mathbb{J}}. \tag{31}$$

Under UEAC($a_{\min}$, $a_{\max}$), for every $0 < r < a_{\min}$ holds $\mathcal{U} \subset \mathcal{A}_r$.

According to the Lax-Milgram theorem, for every $z \in \mathcal{A}_r$ there exists a unique
solution $p(z) \in V$ of the variational problem: given $f \in V^*$, for every $z \in \mathcal{A}_r$, find
$p \in V$ such that

$$a(z; p, q) = (f|q) \quad \forall q \in V. \tag{32}$$

Here the sesquilinear form $a(z; \cdot, \cdot)$ is defined as

$$a(z; p, q) = \int_D u(x, z) \nabla p \cdot \nabla q\, dx \quad \forall p, q \in V. \tag{33}$$
Jd 

We next show that the analytic continuation of the parametric solution $p(y)$ to the
domain $\mathcal{A}_r$ is the unique solution $p(z)$ of (32) which satisfies the a-priori estimate

$$\sup_{z \in \mathcal{A}_r} \|p(z)\|_V \le \frac{\|f\|_{V^*}}{r}. \tag{34}$$

The first step of our analysis is to establish strong holomorphy of the forward solution
map $z \mapsto p(z)$ in (32) with respect to the countably many variables $z_j$ at any point
$z \in \mathcal{A}_r$. This follows from the observation that the function $p(z)$ is the solution to the
operator equation $A(z) p(z) = f$, where the operator $A(z) \in \mathcal{L}(V, V^*)$ depends in an
affine manner on each variable $z_j$. To prepare the argument for proving holomorphy of
the functionals $\Phi$ and $\Theta$ appearing in (8), (11) we give a direct proof.

Lemma 4.1 below is proved by means of a difference quotient argument, given in [7],
which uses Lemma 3.3. Lemma 4.1, together with Hartogs' Theorem (see, e.g., [13])
and the separability of $V$, implies strong holomorphy of $p(z)$ as a $V$-valued function on
$\mathcal{A}_r$, stated as Theorem 4.2 below. The proof of this theorem can also be found in [7]; the
result will also be obtained as a corollary of the analyticity results for the functionals
$\Psi$, $\Theta$ proved below.

Lemma 4.1. At any $z \in \mathcal{A}_r$, the function $z \mapsto p(z)$ admits a complex derivative
$\partial_{z_j} p(z) \in V$ with respect to each variable $z_j$. This derivative is the weak solution of the
problem: given $z \in \mathcal{A}_r$, find $\partial_{z_j} p(z) \in V$ such that

$$a(z; \partial_{z_j} p(z), q) = L_0(q) := -\int_D \psi_j \nabla p(z) \cdot \nabla q\, dx, \quad \text{for all } q \in V. \tag{35}$$

Theorem 4.2. Under UEAC($a_{\min}$, $a_{\max}$) for any $0 < r < a_{\min}$, the solution $p(z) =
G(u(z))$ of the parametric forward problem is holomorphic as a $V$-valued function in $\mathcal{A}_r$
and the a priori estimate (34) holds.
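
A discrete analogue of the derivative characterization (35) can be checked numerically for a single complex parameter. The sketch below rests on assumptions not in the text: a one-dimensional finite element discretization on $D = (0, 1)$, hypothetical functions `a_bar` and `psi`, and an arbitrary evaluation point `z`. Because the stiffness matrix depends affinely on $z$, the derivative solves a linear system with the same matrix, mirroring (35).

```python
import numpy as np

def stiffness(u_mid, h):
    """P1 finite element matrix of p -> -(u p')' on a uniform grid, p = 0 at both ends."""
    main = (u_mid[:-1] + u_mid[1:]) / h
    off = -u_mid[1:-1] / h
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

n_el = 32
h = 1.0 / n_el
x_mid = (np.arange(n_el) + 0.5) * h
a_bar = np.ones(n_el)                           # hypothetical mean coefficient
psi = 0.4 * np.sin(2.0 * np.pi * x_mid)         # hypothetical fluctuation
A0, A1 = stiffness(a_bar, h), stiffness(psi, h) # A(z) = A0 + z * A1 is affine in z
b = h * np.ones(n_el - 1)                       # load vector for f = 1

def p_of(z):
    """Discrete parametric solution p(z) for a complex parameter z."""
    return np.linalg.solve(A0 + z * A1, b)

z = 0.3 + 0.2j
# Discrete counterpart of (35): A(z) dp = -A1 p(z).
dp = -np.linalg.solve(A0 + z * A1, A1 @ p_of(z))

# Difference quotients in the real and imaginary directions.
eps = 1e-6
dp_real = (p_of(z + eps) - p_of(z - eps)) / (2.0 * eps)
dp_imag = (p_of(z + 1j * eps) - p_of(z - 1j * eps)) / (2.0j * eps)
```

Agreement of the difference quotients in both directions with `dp` is the discrete signature of complex differentiability at `z`.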






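We remark that $\mathcal{A}_r$ also contains certain polydiscs: for any sequence $\rho := (\rho_j)_{j \ge 1}$
of positive radii we define the polydisc

$$\mathcal{U}_\rho = \bigotimes_{j \in \mathbb{J}} \{ z_j \in \mathbb{C} : |z_j| \le \rho_j \} = \{ z = (z_j)_{j \in \mathbb{J}} : z_j \in \mathbb{C},\ |z_j| \le \rho_j \} \subset \mathbb{C}^{\mathbb{J}}. \tag{36}$$

We say that a sequence $\rho = (\rho_j)_{j \ge 1}$ of radii is $r$-admissible if and only if for every $x \in D$

$$\sum_{j \in \mathbb{J}} \rho_j |\psi_j(x)| \le \Re(\bar a(x)) - r. \tag{37}$$

If the sequence $\rho$ is $r$-admissible, then the polydisc $\mathcal{U}_\rho$ is contained in $\mathcal{A}_r$ since on the
one hand for all $z \in \mathcal{U}_\rho$ and for almost every $x \in D$

$$\Re(u(x, z)) \ge \Re(\bar a(x)) - \sum_{j \in \mathbb{J}} |z_j \psi_j(x)| \ge \Re(\bar a(x)) - \sum_{j \in \mathbb{J}} \rho_j |\psi_j(x)| \ge r$$

and on the other hand, for every $x \in D$

$$|u(x, z)| \le |\bar a(x)| + \sum_{j \in \mathbb{J}} |z_j \psi_j(x)| \le |\bar a(x)| + \Re(\bar a(x)) - r \le 2|\bar a(x)| \le 2 a_{\max}.$$

Here we used $|\bar a(x)| \le a_{\max}$ which follows from UEAC($a_{\min}$, $a_{\max}$).

Similar to (22), the validity of the lower inequality in (30) for all $z \in \mathcal{U}$ is equivalent
to the condition that

$$\sum_{j \in \mathbb{J}} |\psi_j(x)| \le \Re(\bar a(x)) - a_{\min}, \quad x \in D. \tag{38}$$

This shows that the constant sequence $\rho_j = 1$ is $r$-admissible for all $0 < r \le a_{\min}$.

Remark 4.3. For $0 < r < a_{\min}$ there exist $r$-admissible sequences such that $\rho_j > 1$ for
all $j \ge 1$, i.e. such that the polydisc $\mathcal{U}_\rho$ is strictly larger than $\mathcal{U}$ in every variable. This
will be exploited systematically below in the derivation of approximation bounds. □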
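
Condition (37) is easy to test on a spatial grid for a finite number of parameters. The sketch below is illustrative and rests on hypothetical choices: real-valued $\bar a \equiv 1$, fluctuations $\psi_j(x) = 0.3\, j^{-2} \sin(j \pi x)$ on $D = (0, 1)$, twenty parameters, and $r = 0.4$. The helper `is_admissible` is not from the text, and checking on a grid only approximates the requirement "for every $x \in D$".

```python
import numpy as np

def is_admissible(rho, psi_abs, a_bar_real, r):
    """Check (37) at grid points: sum_j rho_j |psi_j(x)| <= Re(a_bar(x)) - r."""
    return bool(np.all(rho @ psi_abs <= a_bar_real - r + 1e-14))

x = np.linspace(0.0, 1.0, 201)
J = 20
j = np.arange(1, J + 1)
# psi_abs[j-1, i] = |psi_j(x_i)| for the hypothetical fluctuations.
psi_abs = np.abs(0.3 * j[:, None] ** -2.0 * np.sin(j[:, None] * np.pi * x[None, :]))
a_bar_real = np.ones_like(x)
r = 0.4

rho_unit = np.ones(J)              # the constant sequence rho_j = 1
rho_growing = 1.0 + 0.02 * j       # radii strictly larger than 1 in every variable
rho_too_large = 3.0 * np.ones(J)   # violates (37) near x = 0.5

unit_ok = is_admissible(rho_unit, psi_abs, a_bar_real, r)
growing_ok = is_admissible(rho_growing, psi_abs, a_bar_real, r)
too_large_ok = is_admissible(rho_too_large, psi_abs, a_bar_real, r)
```

The sequence `rho_growing` is an instance of the situation in Remark 4.3: every radius exceeds 1, yet (37) still holds because the fluctuations decay fast enough to absorb the growth.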

4.3. Holomorphy of response functionals

We next show that, for given data $\delta$, the functionals $\mathcal{G}(\cdot)$, $\Phi(u(\cdot); \delta)$ and $\Theta(\cdot)$ depend
holomorphically on the parameter vector $z \in \mathbb{C}^{\mathbb{J}}$, on polydiscs $\mathcal{U}_\rho$ as in (36) for suitable
$r$-admissible sequences of semiaxes $\rho$. Our general strategy for proving this will be
analogous to the argument for establishing analyticity of the map $z \mapsto G(u(z))$ as a
$V$-valued function.

We now extend Theorem 4.2 from the solution of the elliptic PDE to the posterior
density, and related quantities required to define expectations under the posterior,
culminating in Theorem 4.8 and Corollary 4.9. We achieve this through a sequence
of lemmas which we now derive.

The following lemma is simply a complexification of (18) and (26). It implies bounds
on $\mathcal{G}$ and its Lipschitz constant in the covariance weighted norm.

Lemma 4.4. Under UEAC($a_{\min}$, $a_{\max}$), for every $f \in V^* = H^{-1}(D)$ and for every
O(-) e (V*)* ~ 1/ ^ F = M-^' holds 

mu)\ <^(X:i|o.||^0^ (39) 



k=l 

K 



MIN j^^^ 



To be concrete we concentrate in the next lemma on computing the expected value
of the pressure $p = G(u) \in V$ under the posterior measure. To this end we define $\Psi$
with $\phi$ as in (13) with $m = 1$. We start by considering the case of a single parameter.

Lemma 4.5. Let $\mathbb{J} = \{1\}$ and take $\phi = G : U \to V$. With $u(x, y)$ as in (19), under
UEAC($a_{\min}$, $a_{\max}$), the functions $\Psi : [-1, 1] \to V$ and $\Theta : [-1, 1] \to \mathbb{R}$ and the potential
$\Phi(u(x, \cdot); \delta)$ defined by (11), (8) and (3) respectively, may be extended to functions which
are strongly holomorphic on the strip $\{ y + iz : |y| < r/\kappa \}$ for any $r \in (\kappa, 1)$.

Proof. We view $H$, $V$ and $X = L^\infty(D)$ as Banach spaces over $\mathbb{C}$. We extend the equation
(19) to complex coefficients $u(x, z) = \bar a(x) + z\psi(x)$, for which
$\Re(\bar a(x) + z\psi(x)) = \bar a(x) + y\psi(x)$ since $z = y + i\zeta$.
Note that $\bar a + z\psi$ is holomorphic in $z$ since it is linear. Since $\Re(\bar a + z\psi) = \bar a + y\psi \ge a_{\min}$,
it follows that, for all $\zeta = \Im(z)$,

$$\Re \int_D u(x)\, |\nabla p(x) - \nabla \tilde p(x)|^2\, dx \ge a_{\min} \|p - \tilde p\|_V^2.$$

We prove that the mappings $\Psi$ and $\Theta$ are holomorphic by studying the properties
of $G(\bar a + z\psi)$ and $\Phi(\bar a + z\psi)$ as functions of $z \in \mathbb{C}$. Let $h \in \mathbb{C}$ with $|h| < \epsilon \ll 1$. We
show that

$$\lim_{|h| \to 0} h^{-1} (p(z + h) - p(z))$$

exists in $V$ (strong holomorphy). Note first that $\partial_z u = \psi$. Now consider $p$. We have

$$\frac{1}{h} (p(z + h) - p(z)) = \frac{1}{h} \big( G(\bar a + (z + h)\psi) - G(\bar a + z\psi) \big) =: r.$$

By Lemma 3.3 we deduce that

$$\|r\|_V \le \frac{\|f\|_{V^*}}{a_{\min}^2}\, \|\psi\|_{L^\infty(D)}.$$

From this it follows that there is a weakly convergent subsequence in $V$ as $|h| \to 0$. We
proceed to deduce existence of a strong limit. To this end, we introduce the sesquilinear
form

$$b(p, q) = \int_D u \nabla p \nabla q\, dx.$$

Then

$$b(G(u), q) = (f, q) \quad \forall q \in V.$$
For a coefficient function $u$ as in (19), the form $b(\cdot, \cdot)$ is equal to the parametric
sesquilinear form $a(z; p, q)$ defined in (33).

Note that for $z = \bar a + y\psi \in \mathbb{R}$ and for real-valued arguments $p$ and $q$, the parametric
sesquilinear form $a(z; p, q)$ coincides with the bilinear form in (15). Accordingly, for
every $z \in \mathbb{C}$ the unique holomorphic extension of the parametric solution $G(u(\bar a + y\psi))$
to complex parameters $z = y + i\zeta$ is the unique variational solution of the parametric
problem

$$a(z; G(\bar a + z\psi), q) = (f|q) \quad \forall q \in V. \tag{41}$$

Assumption UEAC($a_{\min}$, $a_{\max}$) is readily seen to imply

$$\forall p \in V : \quad \Re(a(z; p, p)) \ge a_{\min} \|p\|_V^2.$$

If we choose $\delta \in (\kappa, 1)$ and choose $z = y + i\zeta$, we obtain, for all $\zeta$ and for $|y| \le \delta/\kappa$

$$\Re(a(z; p, p)) \ge a_{\min} (1 - \delta) \|p\|_V^2. \tag{42}$$

From (41) we see that for such values of $z = y + i\zeta$,

$$\begin{aligned}
0 &= a\big(z; G(\bar a + z\psi), q\big) - a\big(z; G(\bar a + (z + h)\psi), q\big) \\
&\quad + a\big(z; G(\bar a + (z + h)\psi), q\big) - a\big(z + h; G(\bar a + (z + h)\psi), q\big) \\
&= a\big(z; G(\bar a + z\psi) - G(\bar a + (z + h)\psi), q\big) - \int_D h \psi \nabla G(\bar a + (z + h)\psi) \nabla q\, dx.
\end{aligned}$$

Dividing by $h$ we obtain that $r$ satisfies, for all $z = y + i\zeta$ with $|y| \le \delta/\kappa$ and every
$\zeta \in \mathbb{R}$,

$$\forall q \in V : \quad a(z; r, q) + \int_D \psi \nabla G(\bar a + (z + h)\psi) \nabla q\, dx = 0. \tag{43}$$

The second term we denote by $s(h)$ and note that, by Lemma 3.3,

$$|s(h_1) - s(h_2)| \le \frac{1}{a_{\min}^2}\, \|\psi\|_{L^\infty(D)}^2 \|f\|_{V^*} \|q\|_V\, |h_1 - h_2|.$$

If we denote the solution $r$ to equation (43) by $r_h(\bar a; z)$ then we deduce from the Lipschitz
continuity of $s(\cdot)$ that $r_h(\bar a; z) \to r_0(\bar a; z)$ where

$$a(z; r_0, q) = -s(0) \quad \forall q \in V.$$

Hence $r_0 = \partial_z G(\bar a + z\psi) \in V$ and we deduce that $G : [-1, 1] \to V$ can be
extended to a complex-valued function which is strongly holomorphic on the strip
$\{ y + i\zeta : |y| < \delta/\kappa,\ \zeta \in \mathbb{R} \}$.

We next study the domain of holomorphy of the analytic continuation of the
potential $\Phi(\bar a + z\psi; \delta)$ to parameters $z \in \mathbb{C}$. It suffices to consider $K = 1$ noting
that then the unique analytic continuation of the potential $\Phi$ is given by

$$\Phi(\bar a + z\psi; \delta) = \frac{1}{2} \big( \delta - \mathcal{G}(\bar a + z\psi) \big)^\top \Gamma^{-1} \big( \delta - \mathcal{G}(\bar a + z\psi) \big). \tag{44}$$

The function $z \mapsto \mathcal{G}(\bar a + z\psi)$ is holomorphic with the same domain of holomorphy as
$G(\bar a + z\psi)$. Similarly it follows that the function

$$\big( \delta - \mathcal{G}(\bar a + z\psi) \big)^\top \big( \delta - \mathcal{G}(\bar a + z\psi) \big)$$

is holomorphic, with the same domain of holomorphy; this is shown by composing the
relevant power series expansion. From this we deduce that $\Theta$ and $\Psi$ are holomorphic,
with the same domain of holomorphy. □

So far we have considered the case $J = \{1\}$. We now generalize. To this end, we pick an arbitrary $m \in J$ and write $y = (y^*, y_m)$ and $z = (z^*, z_m)$.

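Assumption 4.6. There are constants $0 < a_{\min} \le a_{\max} < \infty$ and $\kappa \in (0,1)$ such that

$$0 < a_{\min} \le \bar a \le a_{\max} < \infty \ \text{ a.e. } x \in D, \qquad \sum_{j\in J}\|\psi_j\|_{L^\infty(D)} \le \kappa\, a_{\min} . \tag{45}$$

For $m \in J$, we write (19) in the form

$$u(x; y) = \bar a(x) + y_m\psi_m(x) + \sum_{j \in J\setminus\{m\}} y_j\psi_j(x) .$$

From Assumption 4.6 we deduce that there are numbers $\kappa_j \le \kappa$ such that $\|\psi_j\|_{L^\infty(D)} \le \kappa_j a_{\min}$ and $\sum_{j\in J}\kappa_j \le \kappa$. Hence we obtain, for every $x \in D$ and every $y \in U$ the lower bound

$$u(x; y) \ge a_{\min}\big(1 - (\kappa - \kappa_m) - \kappa_m|y_m|\big) \ge a'_{\min}\big(1 - \kappa'_m|y_m|\big)$$

with $a'_{\min} = a_{\min}(1 - \kappa)$ and $\kappa'_m = \kappa_m(1 - (\kappa - \kappa_m))^{-1} \in (0,1)$. With this observation we obtain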



Lemma 4.7. Let Assumption 4.6 hold and set $U = [-1,1]^J$ and $\phi = G : U \to V$. Then the functions $\Psi : U \to V$ and $\Theta : U \to \mathbb{R}$, as well as the potential $\Phi : U \to \mathbb{R}$, admit unique extensions to strongly holomorphic functions on the product of strips given by

$$S_\rho := \bigotimes_{j\in J}\{y_j + i z_j : |y_j| < \delta_j/\kappa'_j,\ z_j \in \mathbb{R}\} \tag{46}$$

for any sequence $\delta = (\delta_j)_{j\in J}$ with $\delta_j \in (\kappa'_j, 1)$.

Proof. Fixing $y^*$, we view $\Psi$ and $\Theta$ as functions of the single parameter $y_m$. For each fixed $y^*$, we extend $y_m$ to a complex variable $z_m$. The estimates preceding the statement of this lemma, together with Lemma 4.5, show that $\Psi$ and $\Theta$ are holomorphic in the strip $\{y_m + i z_m : |y_m| < \delta_m/\kappa'_m\}$ for any $\delta_m \in (\kappa'_m, 1)$. Hartogs' theorem [13] and the fact that in separable Banach spaces (such as $V$) weak holomorphy equals strong holomorphy extend this result onto the product of strips, $S_\rho$. □

We note that the strip $S_\rho \subset \mathbb{C}^J$ defined in (46) contains in particular the polydisc $\mathcal{U}_\rho$ with radii $(\rho_j)_{j\in J}$, where $\rho_j = \delta_j/\kappa'_j$.

4.4. Holomorphy and bounds on the posterior density

So far, we have shown that the responses $G(u)$, $\mathcal{G}(u)$ and the potentials $\Phi(u;\delta)$ depend holomorphically on the coordinates $z \in \mathcal{A}_r \subset \mathbb{C}^J$ in the parametric representation $u = \bar a + \sum_{j\in J} z_j\psi_j$. Now we deduce bounds on the analytic continuation of the posterior density $\Theta(z)$ in (8) as a function of the parameters $z$ on the domains of holomorphy. We have

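Theorem 4.8. Under UEAC($a_{\min}$, $a_{\max}$), for the analytic continuation $\Theta(z)$ of the posterior density to the domains $\mathcal{A}_r$ of holomorphy defined in (31), i.e. for

$$\Theta(z) = \exp\Big(-\Phi(u;\delta)\big|_{u = \bar a + \sum_{j\in J} z_j\psi_j}\Big) \tag{47}$$

there holds for every $0 < r < a_{\min}$

$$\sup_{z\in\mathcal{A}_r}|\Theta(z)| = \sup_{z\in\mathcal{A}_r}\big|\exp(-\Phi(u(z);\delta))\big| \le \exp\Big(\frac{\|f\|_{V^*}^2}{r^2}\sum_{k=1}^{K}\|o_k\|_{V^*}^2\Big) . \tag{48}$$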

These analyticity properties, and resulting bounds, can be extended to functions $\phi(\cdot)$ as defined by (13), using Lemma 4.7 and Theorem 4.8. This gives the following result.

Corollary 4.9. Under UEAC($a_{\min}$, $a_{\max}$), for any $m \in \mathbb{N}$ and for the functionals $\phi(u) = p^{(m)} \in S = V^{(m)}$, the posterior densities $\Psi(z) = \Theta(z)\phi(u(z))$ defined in (11) admit analytic continuations as strongly holomorphic, $V^{(m)}$-valued functions with domains $\mathcal{A}_r$ of holomorphy defined in (31). Moreover, for these functionals the analytic continuations of $\Psi$ in (11) admit the bounds

$$\sup_{z\in\mathcal{A}_r}\big\|\Theta(z)\,(p(z))^{(m)}\big\|_{V^{(m)}} \le \frac{\|f\|_{V^*}^m}{r^m}\exp\Big(\frac{\|f\|_{V^*}^2}{r^2}\sum_{k=1}^{K}\|o_k\|_{V^*}^2\Big) . \tag{49}$$



5. Polynomial Chaos Approximations of the Posterior 

Building on the results of the previous section, we now proceed to approximate $\Theta(z)$, viewed as a holomorphic functional over $z \in \mathbb{C}^J$, by so-called polynomial chaos representations. Exactly the same results on analyticity and on $N$-term approximation of $\Psi(z)$ hold. We omit details for reasons of brevity of exposition and confine ourselves to establishing rates of convergence of $N$-term truncated representations of the posterior density $\Theta$. The results in the present section are, in one sense, sparsity results on the posterior density $\Theta$. On the other hand, such $N$-term truncated gpc representations of $\Theta$ are, as we will show in the next section, computationally accessible once sparse truncated adaptive forward solvers of the parametrized system of interest are available. Such solvers are indeed available (see, e.g., [3, 5, 22] and the references therein), so that the abstract approximation results in the present section have a substantive constructive aspect. Algorithms based on Smolyak-type quadratures in $U$ which are designed based on the present theoretical results will be developed and analyzed in [1]. In this section we analyze the convergence rate of $N$-term truncated Legendre gpc-approximations of $\Theta$ and, with the aim of a constructive $N$-term approximation of the posterior $\Theta(y)$ in $U$ in Section 6 ahead, we analyze also $N$-term truncated monomial gpc-approximations of $\Theta(y)$.

5.1. gpc Representations of $\Theta$

With the index set $J$ from the parametrization (19) of the input, we associate the countable index set

$$\mathcal{F} = \{\nu \in \mathbb{N}_0^J : |\nu|_1 < \infty\} \tag{50}$$

of multiindices where $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$. We remark that sequences $\nu \in \mathcal{F}$ are finitely supported even for $J = \mathbb{N}$. For $\nu \in \mathcal{F}$, we denote by $I_\nu = \{j \in \mathbb{N} : \nu_j \ne 0\} \subset \mathbb{N}$ the "support" of $\nu$, i.e. the finite set of indices of entries of $\nu$ which are non-zero, and by $\aleph(\nu) := \# I_\nu < \infty$ the "support size" of $\nu$, i.e. the cardinality of $I_\nu$.

For the deterministic approximation of the posterior density $\Theta(y)$ in (IHl) we shall use tensorized polynomial bases similar to what is done in so-called "polynomial chaos" expansions of random fields. We shall consider two particular polynomial bases, Legendre and monomial bases.

5.1.1. Legendre Expansions of $\Theta$. Since we assumed that the prior measure $\mu_0(dy)$ is built by tensorization of the uniform probability measures on $(-1,1)$, we build the bases by tensorization as follows: let $L_k(z_j)$ denote the $k$-th Legendre polynomial of the variable $z_j \in \mathbb{C}$, normalized such that

$$\int_{-1}^{1}(L_k(t))^2\,\frac{dt}{2} = 1, \qquad k = 0, 1, 2, \ldots \tag{51}$$

Note that $L_0 \equiv 1$. The Legendre polynomials $L_k$ in (51) are extended to tensor-product polynomials on $U$ via

$$L_\nu(z) = \prod_{j\in J} L_{\nu_j}(z_j), \qquad z \in \mathbb{C}^J,\ \nu \in \mathcal{F} . \tag{52}$$

The normalization (51) implies that the polynomials $L_\nu(z)$ in (52) are well-defined for any $z \in \mathbb{C}^J$ since the finite support of each $\nu \in \mathcal{F}$ implies that $L_\nu$ in (52) is the product of only finitely many nontrivial polynomials. It moreover implies that the set of tensorized Legendre polynomials

$$\mathbb{P}(U, \mu_0(dy)) := \{L_\nu : \nu \in \mathcal{F}\} \tag{53}$$

forms a countable orthonormal basis in $L^2(U, \mu_0(dy))$. This observation suggests, by virtue of Lemma 5.1 below, the use of mean square convergent gpc-expansions to represent $\Theta$ and $\Psi$. Such expansions can also serve as a basis for sampling of these quantities with draws that are equidistributed with respect to the prior $\mu_0$.
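The normalization of the univariate Legendre polynomials with respect to the uniform probability measure $dt/2$ can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `L` and `gram` are illustrative and not part of the paper. It rescales the classical Legendre polynomials $P_k$ by $\sqrt{2k+1}$ and verifies orthonormality with a Gauss-Legendre rule.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def L(k, t):
    """k-th Legendre polynomial scaled so that int_{-1}^{1} L_k(t)^2 dt/2 = 1."""
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0
    # Classical P_k satisfies int P_k^2 dt = 2/(2k+1), hence the sqrt(2k+1) factor.
    return np.sqrt(2 * k + 1) * leg.legval(t, coeffs)

# 20-point Gauss-Legendre rule: exact for polynomials of degree <= 39.
nodes, weights = leg.leggauss(20)

def gram(kmax):
    """Gram matrix of L_0..L_kmax w.r.t. the uniform probability measure dt/2."""
    vals = np.array([L(k, nodes) for k in range(kmax + 1)])
    return (vals * weights / 2.0) @ vals.T

G = gram(6)
print(np.round(G, 12))
```

The tensorized polynomials $L_\nu$ inherit orthonormality factor by factor, since each $\nu$ has finite support and $L_0 \equiv 1$.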
Lemma 5.1. The density $\Theta : U \to \mathbb{R}$ is square integrable with respect to the prior $\mu_0(dy)$ over $U$, i.e. $\Theta \in L^2(U, \mu_0(dy))$. Moreover, if the functional $\phi : U \to S$ in (11) is bounded, then

$$\int_U \|\Psi(y)\|_S^2\,\mu_0(dy) < \infty,$$

i.e. $\Psi \in L^2(U, \mu_0(dy); S)$.

Proof. Since $\Phi$ is positive it follows that $\Theta(y) \in [0,1]$ for all $y \in U$ and the first result follows because $\mu_0$ is a probability measure. Now define $K = \sup_{y\in U}|\phi(y)|$. Then $\sup_{y\in U}\|\Psi(y)\|_S \le K$ and the second result follows similarly, again using that $\mu_0$ is a probability measure. □

Remark 5.2. It is a consequence of (17) that in the case where $\phi(u) = G(u) = p \in V$ we have $\|p\|_V \le \|f\|_{V^*}/a_{\min}$ for all $y \in U$. Thus the second assertion of Lemma 5.1 holds for calculation of the expectation of the pressure under the posterior distribution on $u$. Indeed the assertion holds for all moments of the pressure, the concrete examples which we concentrate on here. □

Since $\mathbb{P}(U, \mu_0(dy))$ in (53) is a countable orthonormal basis of $L^2(U, \mu_0(dy))$, the density $\Theta(y)$ of the posterior measure given data $\delta \in Y$, and the posterior reweighted pressure $\Psi(y)$ can be represented in $L^2(U, \mu_0(dy))$ by (parametric and deterministic) generalized Legendre polynomial chaos expansions. We start by considering the scalar valued function $\Theta(y)$:

$$\Theta(y) = \sum_{\nu\in\mathcal{F}}\theta_\nu L_\nu(y) \quad \text{in } L^2(U, \mu_0(dy)) \tag{54}$$

where the gpc expansion coefficients $\theta_\nu$ are defined by

$$\theta_\nu = \int_U \Theta(y)L_\nu(y)\,\mu_0(dy), \qquad \nu \in \mathcal{F} . \tag{55}$$

By the normalization (51), it follows immediately from (54), Lemma 5.1 and Parseval's equality that the second moment of the posterior density with respect to the prior,

$$\|\Theta\|_{L^2(U,\mu_0(dy))}^2 = \sum_{\nu\in\mathcal{F}}|\theta_\nu|^2, \tag{56}$$

is finite.
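Parseval's equality for the Legendre coefficients can be illustrated in one parameter dimension. The sketch below is a toy stand-in, not the paper's elliptic problem: it assumes a hypothetical scalar forward map $y \mapsto 1/(1+\kappa y)$, hypothetical constants `kappa` and `delta`, and unit noise covariance. It computes the coefficients $\theta_k$ by Gauss-Legendre quadrature and compares $\sum_k \theta_k^2$ with the second moment of the density.

```python
import numpy as np
from numpy.polynomial import legendre as leg

kappa, delta = 0.5, 1.0          # hypothetical model constants (assumptions)

def theta(y):
    """Toy posterior density exp(-Phi), Phi = (delta - 1/(1 + kappa*y))^2 / 2."""
    return np.exp(-0.5 * (delta - 1.0 / (1.0 + kappa * y)) ** 2)

nodes, weights = leg.leggauss(200)
w = weights / 2.0                # uniform probability measure on [-1, 1]

def coeff(k):
    """theta_k = int Theta(y) L_k(y) mu_0(dy) with L_k = sqrt(2k+1) P_k."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return np.sum(w * theta(nodes) * np.sqrt(2 * k + 1) * leg.legval(nodes, c))

thetas = np.array([coeff(k) for k in range(30)])
second_moment = np.sum(w * theta(nodes) ** 2)   # squared L^2(U, mu_0) norm
parseval_sum = np.sum(thetas ** 2)
print(second_moment, parseval_sum)
```

Because the toy density is analytic in a neighbourhood of $[-1,1]$, thirty coefficients already reproduce the second moment to near machine precision.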



5.1.2. Monomial Expansions of $\Theta$. We next consider expansions of the posterior density $\Theta$ with respect to monomials

$$y^\nu = \prod_{j\ge1} y_j^{\nu_j}, \qquad y \in U,\ \nu \in \mathcal{F} .$$

Once more, the infinite product is well-defined since, for every $\nu \in \mathcal{F}$, it contains only $\aleph(\nu)$ many nontrivial factors. By Lemma 4.7 and Theorem 4.8 the posterior density $\Theta(y)$ admits an analytic continuation to the product of strips $S_\rho$ which contains, in particular, the polydisc $\mathcal{U}_\rho$. In $U$, $\Theta(y)$ can therefore be represented by a monomial expansion with uniquely determined coefficients $\tau_\nu \in \mathbb{R}$ which coincide, by uniqueness of the analytic continuation, with the Taylor coefficients of $\Theta$ at $0 \in U$:

$$\forall y \in U: \quad \Theta(y) = \sum_{\nu\in\mathcal{F}}\tau_\nu y^\nu, \qquad \tau_\nu := \frac{1}{\nu!}\,\partial_y^\nu\Theta(y)\big|_{y=0} . \tag{57}$$
5.2. Best N-term Approximations of $\Theta$

In our deterministic parametric approach to Bayesian estimation, evaluation of expectations under the posterior requires evaluation of the integrals (10) and (12). Our strategy is to approximate these integrals by truncating the spectral representation (54), as well as a similar expression for $\Psi$, to a finite number $N$ of significant terms, and to estimate the error incurred by doing so. It is instructive to compare with Monte Carlo methods. Under the conditions of Lemma 5.1, posterior expectations of functions have finite second moments so that Monte Carlo methods exhibit the convergence rate $N^{-1/2}$ in terms of the number $N$ of samples, with similar extension to MCMC methods. Here, however, we will show that it is possible to derive approximations whose error decays more quickly than $N^{-1/2}$, where $N$ is now the number of significant terms retained in (54).

By (56), the coefficient sequence $(\theta_\nu)_{\nu\in\mathcal{F}}$ must necessarily decay. If this decay is sufficiently strong, possibly high convergence rates of $N$-term approximations of the integrals (10), (12) occur. The following classical result from approximation theory [9] makes these heuristic considerations precise: denote by $(\gamma_n)_{n\in\mathbb{N}}$ a (generally not unique) decreasing rearrangement of the sequence $(|\theta_\nu|)_{\nu\in\mathcal{F}}$. Then, for any summability exponents $0 < \sigma \le q \le \infty$ and for any $N \in \mathbb{N}$ holds

$$\Big(\sum_{n>N}\gamma_n^q\Big)^{1/q} \le N^{-(\frac1\sigma - \frac1q)}\Big(\sum_{n\ge1}\gamma_n^\sigma\Big)^{1/\sigma} . \tag{58}$$
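The rearrangement inequality just stated can be tried out on a concrete sequence. The sketch below assumes a hypothetical, algebraically decaying coefficient sequence `gamma` (it is not derived from the paper's problem) and compares the tail norm beyond the $N$ largest terms with the right-hand side of the bound.

```python
import numpy as np

def stechkin_gap(gamma, sigma, q, N):
    """Return (tail, bound) for the best N-term inequality on a finite sequence."""
    g = np.sort(np.abs(gamma))[::-1]           # decreasing rearrangement
    tail = np.sum(g[N:] ** q) ** (1.0 / q)     # l^q norm beyond the N largest terms
    bound = N ** (-(1.0 / sigma - 1.0 / q)) * np.sum(g ** sigma) ** (1.0 / sigma)
    return tail, bound

gamma = 1.0 / np.arange(1, 2001) ** 3.0        # hypothetical coefficient magnitudes
results = [stechkin_gap(gamma, sigma=0.5, q=2.0, N=N) for N in (1, 10, 100)]
for tail, bound in results:
    print(f"tail = {tail:.3e}   bound = {bound:.3e}")
```

With $q = 2$ this is the mean-square truncation error of an orthonormal expansion; with $q = 1$ it bounds the sum of discarded coefficient magnitudes.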

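5.2.1. $L^2(U;\mu_0)$ Approximation. Denote by $\Lambda_N \subset \mathcal{F}$ a set of indices $\nu \in \mathcal{F}$ corresponding to $N$ largest gpc coefficients $|\theta_\nu|$ in (54), and denote by

$$\Theta_{\Lambda_N}(y) := \sum_{\nu\in\Lambda_N}\theta_\nu L_\nu(y) \tag{59}$$

the Legendre expansion truncated to this set of indices. Using (58) with $q = 2$, Parseval's equation (56) and $0 < \sigma < 1$ we obtain for all $N$

$$\|\Theta - \Theta_{\Lambda_N}\|_{L^2(U,\mu_0)} \le N^{-s}\|(\theta_\nu)\|_{\ell^\sigma(\mathcal{F})}, \qquad s := \frac1\sigma - \frac12 . \tag{60}$$

We infer from (60) that a mean-square convergence rate $s > 1/2$ of the approximate posterior density $\Theta_{\Lambda_N}$ can be achieved provided that $(\theta_\nu) \in \ell^\sigma(\mathcal{F})$ for some $0 < \sigma < 1$.

5.2.2. $L^1(U;\mu_0)$ and pointwise Approximation of $\Theta$. The analyticity of $\Theta(y)$ in $\mathcal{U}_\rho$ implies that $\Theta(y)$ can be represented by the Taylor expansion (57). This expansion is unconditionally summable in $U$ and, for any sequence $\{\Lambda_N\}_{N\in\mathbb{N}} \subset \mathcal{F}$ which exhausts $\mathcal{F}$ §, the corresponding sequence of $N$-term truncated partial Taylor sums

$$T_{\Lambda_N}(y) := \sum_{\nu\in\Lambda_N}\tau_\nu y^\nu \tag{61}$$

converges pointwise in $U$ to $\Theta$. Since for $y \in U$ and $\nu \in \mathcal{F}$ we have $|y^\nu| \le 1$, for any $\Lambda_N \subset \mathcal{F}$ of cardinality not exceeding $N$ holds

$$\sup_{y\in U}|\Theta(y) - T_{\Lambda_N}(y)| = \sup_{y\in U}\Big|\sum_{\nu\in\mathcal{F}\setminus\Lambda_N}\tau_\nu y^\nu\Big| \le \sum_{\nu\in\mathcal{F}\setminus\Lambda_N}|\tau_\nu| . \tag{62}$$

Similarly, we have

$$\|\Theta - T_{\Lambda_N}\|_{L^1(U,\mu_0)} \le \sum_{\nu\in\mathcal{F}\setminus\Lambda_N}|\tau_\nu|\,\|y^\nu\|_{L^1(U,\mu_0)} .$$

For $\nu \in \mathcal{F}$, we calculate

$$\|y^\nu\|_{L^1(U,\mu_0)} = \int_U |y^\nu|\,\mu_0(dy) = \frac{1}{(\nu+1)!}, \qquad (\nu+1)! := \prod_{j\in I_\nu}(\nu_j + 1),$$

so that we find

$$\|\Theta - T_{\Lambda_N}\|_{L^1(U,\mu_0)} \le \sum_{\nu\in\mathcal{F}\setminus\Lambda_N}\frac{|\tau_\nu|}{(\nu+1)!} . \tag{63}$$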
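The $L^1$ norm of a monomial under the uniform product measure factorizes over the coordinates, each factor being $\int_{-1}^{1}|t|^k\,dt/2 = 1/(k+1)$. The sketch below checks this by quadrature for one multi-index; the function name `monomial_l1_norm` and the chosen index are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def monomial_l1_norm(nu, n_quad=64):
    """int_U |y^nu| mu_0(dy) for a finitely supported multi-index nu (tuple of ints)."""
    nodes, weights = leg.leggauss(n_quad)
    # By symmetry integrate over [0, 1], where |t|^k = t^k is a polynomial.
    t, w = 0.5 * (nodes + 1.0), 0.5 * weights
    return float(np.prod([np.sum(w * t ** k) for k in nu]))

nu = (3, 0, 1, 2)
closed_form = float(np.prod([1.0 / (k + 1) for k in nu]))   # product of 1/(nu_j + 1)
print(monomial_l1_norm(nu), closed_form)
```

Coordinates with $\nu_j = 0$ contribute a factor of one, so only the finitely many indices in the support of $\nu$ matter.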



5.2.3. Summary. There are, hence, two main issues to be addressed to employ the preceding approximations in practice: i) establishing the summability of the coefficient sequences in the series (54), (57); and ii) finding algorithms which locate sets $\Lambda_N \subset \mathcal{F}$ of cardinality not exceeding $N$ for which the truncated partial sums preserve the optimal convergence rates and, once these sets are localized, to determine the $N$ "active" coefficients $\theta_\nu$ or $\tau_\nu$, preferably in close to $O(N)$ operations. In the remainder of this section, we address i) and consider ii) in the next section.



5.3. Sparsity of the posterior density $\Theta$

The analysis in the previous section shows that the convergence rate of the truncated gpc-type approximations (59), (61) on the parameter space $U$ is determined by the $\sigma$-summability of the corresponding coefficient sequences $(|\theta_\nu|)_{\nu\in\mathcal{F}}$, $(|\tau_\nu|)_{\nu\in\mathcal{F}}$. We now show that summability (and, hence, sparsity) of Legendre and Taylor coefficient sequences in the expansions (54), (57) is determined by that of the sequence $(\|\psi_j\|_{L^\infty(D)})_{j\in\mathbb{N}}$ in the input's fluctuation expansion (19). Throughout, Assumptions 3.1 and 3.2 will be required to hold. We formalize the decay of the $\psi_j$ in (jl]) by

§ We recall that a sequence $\{\Lambda_N\}_{N\in\mathbb{N}} \subset \mathcal{F}$ of index sets $\Lambda_N$ whose cardinality does not exceed $N$ exhausts $\mathcal{F}$ if any finite $\Lambda \subset \mathcal{F}$ is contained in all $\Lambda_N$ for $N \ge N_0$ with $N_0$ sufficiently large.

Assumption 5.3. There exists $0 < \sigma < 1$ such that for the parametric representations (19), (g]) it holds that

$$\sum_{j=1}^{\infty}\|\psi_j\|_{L^\infty(D)}^{\sigma} < \infty . \tag{64}$$

The strategy of establishing sparsity of the sequences $(|\theta_\nu|)_{\nu\in\mathcal{F}}$, $(|\tau_\nu|)_{\nu\in\mathcal{F}}$ is based on estimating the sequences by Cauchy's integral formula applied to the analytic continuation of $\Theta$.



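5.3.1. Complex extension of the parametric problem. To estimate $|\theta_\nu|$ in (59), we shall use the holomorphy of the solution to the (analytic continuation of the) parametric deterministic problem: let $0 < K < 1$ be a constant such that

$$K\sum_{j=1}^{\infty}\|\psi_j\|_{L^\infty(D)} < \frac{a_{\min}}{8} . \tag{65}$$

Such a constant exists by Assumption 5.3. For $K$ selected in this fashion, we next choose an integer $J_0$ such that

$$\sum_{j>J_0}\|\psi_j\|_{L^\infty(D)} < \frac{a_{\min}K}{24(1+K)} .$$

Let $E = \{1, 2, \ldots, J_0\}$ and $F = \mathbb{N}\setminus E$. We define

$$|\nu_F| = \sum_{j>J_0}|\nu_j| .$$

For each $\nu \in \mathcal{F}$ we define a $\nu$-dependent radius vector $r = (r_m)_{m\in J}$ with $r_m > 0$ for all $m \in J$ as follows:

$$r_m = K \ \text{ when } m \le J_0 \quad\text{and}\quad r_m = 1 + \frac{a_{\min}\nu_m}{4|\nu_F|\,\|\psi_m\|_{L^\infty(D)}} \ \text{ when } m > J_0, \tag{66}$$

where we make the convention that $|\nu_j|/|\nu_F| = 0$ if $|\nu_F| = 0$. We consider the open discs $\mathcal{U}_m \subset \mathbb{C}$ defined by

$$[-1,1] \subset \mathcal{U}_m := \{z_m \in \mathbb{C} : |z_m| < 1 + r_m\} \subset \mathbb{C} . \tag{67}$$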



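We will extend the parametric deterministic problem (14) to parameter vectors $z$ in the polydiscs

$$\mathcal{U}_{1+r} := \bigotimes_{m\in J}\mathcal{U}_m \subset \mathbb{C}^J . \tag{68}$$

To do so, we invoke the analytic continuation of the parametric, deterministic coefficient function $u(x,y)$ in (19) to $z \in \mathcal{U}_{1+r}$ which is for such $z$ formally given by

$$u(x,z) = \bar a(x) + \sum_{m\in J}\psi_m(x)\,z_m .$$

We verify that this expression is meaningful for $z \in \mathcal{U}_{1+r}$: we have, for almost every $x \in D$,

$$\begin{aligned}
|u(x,z)| &\le |\bar a(x)| + \sum_{m\in J}|\psi_m(x)|(1 + r_m) \\
&\le \operatorname*{ess\,sup}_{x\in D}|\bar a(x)| + \sum_{m=1}^{J_0}\|\psi_m\|_{L^\infty(D)}(1+K) + \sum_{m>J_0}\|\psi_m\|_{L^\infty(D)}\Big(2 + \frac{a_{\min}\nu_m}{4|\nu_F|\,\|\psi_m\|_{L^\infty(D)}}\Big) \\
&\le \|\bar a\|_{L^\infty(D)} + 2\sum_{m\ge1}\|\psi_m\|_{L^\infty(D)} + \frac{a_{\min}}{4} .
\end{aligned}$$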



5.3.2. Estimates of the $\theta_\nu$

Proposition 5.4. There exists a constant $C > 0$ such that, with the constant $K \in (0,1)$ in (65), for every $\nu \in \mathcal{F}$ the following estimate holds

$$|\theta_\nu| \le C\prod_{m\in I_\nu}\frac{2(1+K)}{K}\,\eta_m^{-\nu_m}, \tag{69}$$

where $\eta_m := r_m + \sqrt{1 + r_m^2}$ with $r_m$ as in (66).

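Proof. For $\nu \in \mathcal{F}$, define $\theta_\nu$ by (55), let $S = I_\nu$ and define $\bar S = J\setminus S$. Denote by $\mathcal{U}_S = \bigotimes_{m\in S}\mathcal{U}_m$ and $\mathcal{U}_{\bar S} = \bigotimes_{m\in\bar S}\mathcal{U}_m$, and by $y_S = \{y_i : i \in S\}$ the extraction from $y$. Let $\mathcal{E}_m$ be the ellipse in $\mathcal{U}_m$ with foci at $\pm1$ and semiaxis sum $\eta_m > 1$. Denote also $\mathcal{E}_S = \prod_{m\in I_\nu}\mathcal{E}_m$. We can then write (55) as

$$\theta_\nu = \frac{1}{(2\pi i)^{|\nu|_0}}\int_U L_\nu(y)\oint_{\mathcal{E}_S}\frac{\Theta(z_S, y_{\bar S})}{(z_S - y_S)^{\mathbf{1}}}\,dz_S\,d\rho(y), \qquad (z_S - y_S)^{\mathbf{1}} := \prod_{m\in S}(z_m - y_m) .$$

For each $m \in \mathbb{N}$, let $\Gamma_m$ be a copy of $[-1,1]$ and $y_m \in \Gamma_m$. We denote by $U_S = \prod_{m\in S}\Gamma_m$ and $U_{\bar S} = \prod_{m\in\bar S}\Gamma_m$. We then have

$$\theta_\nu = \frac{1}{(2\pi i)^{|\nu|_0}}\int_{U_{\bar S}}\oint_{\mathcal{E}_S}\Theta(z_S, y_{\bar S})\int_{U_S}\frac{L_\nu(y)}{(z_S - y_S)^{\mathbf{1}}}\,d\rho_S(y_S)\,dz_S\,d\rho_{\bar S}(y_{\bar S}) .$$

To proceed further, we recall the definition of the Legendre functions of the second kind

$$Q_n(z) = \int_{[-1,1]}\frac{L_n(y)}{z - y}\,d\rho(y) .$$

Let $\nu_S$ be the restriction of $\nu$ to $S$. We define

$$Q_{\nu_S}(z_S) = \prod_{m\in S}Q_{\nu_m}(z_m) .$$

Under the Joukovski transformation $z_m = \frac12(w_m + w_m^{-1})$ the Legendre functions of the second kind take the form

$$Q_{\nu_m}\big(\tfrac12(w_m + w_m^{-1})\big) = \sum_{k=\nu_m+1}^{\infty}\frac{q_{\nu_m k}}{w_m^k}$$

with $|q_{\nu_m k}| \le \pi$. Therefore, for $z_m \in \mathcal{E}_m$,

$$|Q_{\nu_m}(z_m)| \le \sum_{k=\nu_m+1}^{\infty}\frac{\pi}{\eta_m^k} = \frac{\pi\,\eta_m^{-\nu_m-1}}{1 - \eta_m^{-1}} .$$

We then have

$$\begin{aligned}
|\theta_\nu| &= \Big|\frac{1}{(2\pi i)^{|\nu|_0}}\int_{U_{\bar S}}\oint_{\mathcal{E}_S}\Theta(z_S, y_{\bar S})\,Q_{\nu_S}(z_S)\,dz_S\,d\rho_{\bar S}(y_{\bar S})\Big| \\
&\le \frac{1}{(2\pi)^{|\nu|_0}}\int_{U_{\bar S}}\oint_{\mathcal{E}_S}|\Theta(z_S, y_{\bar S})|\,|Q_{\nu_S}(z_S)|\,|dz_S|\,d\rho_{\bar S}(y_{\bar S}) \\
&\le \frac{1}{(2\pi)^{|\nu|_0}}\|\Theta(z)\|_{L^\infty(\mathcal{E}_S\times U_{\bar S})}\max_{\mathcal{E}_S}|Q_{\nu_S}|\prod_{m\in S}\mathrm{Len}(\mathcal{E}_m) \\
&\le \|\Theta(z)\|_{L^\infty(\mathcal{E}_S\times U_{\bar S})}\prod_{m\in S}\frac{2(1+K)}{K}\,\eta_m^{-\nu_m},
\end{aligned}$$

as $\mathrm{Len}(\mathcal{E}_m) \le 4\eta_m$, $\eta_m \ge 1 + K$ and as $|\Theta(z)|$ is uniformly bounded on $\mathcal{E}_S\times U_{\bar S}$ by Theorem 4.8. □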

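5.3.3. Summability of the $\theta_\nu$. To show the $\ell^\sigma(\mathcal{F})$ summability of $|\theta_\nu|$, we use the following result, which appears as Theorem 7.2 in [6].

Proposition 5.5. For $0 < \sigma < 1$ and for any sequence $(b_m)_{m\in\mathbb{N}}$,

$$\Big(\frac{|\nu|!}{\nu!}\,b^\nu\Big)_{\nu\in\mathcal{F}} \in \ell^\sigma(\mathcal{F}) \iff \sum_{m\ge1}|b_m| < 1 \ \text{ and } \ (b_m)_{m\in\mathbb{N}} \in \ell^\sigma(\mathbb{N}) .$$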
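The role of the condition $\sum_m |b_m| < 1$ can be seen in the limiting case $\sigma = 1$, which the proposition itself does not cover: by the multinomial theorem, $\sum_{|\nu| = k}\frac{|\nu|!}{\nu!}b^\nu = (\sum_m b_m)^k$, so the full sum is a geometric series. The sketch below checks this for a hypothetical three-term sequence `b`; the function name is illustrative.

```python
import itertools
import math

def multinomial_series(b, max_order):
    """Sum of |nu|!/nu! * b^nu over all multi-indices nu with |nu| <= max_order."""
    total = 0.0
    for nu in itertools.product(range(max_order + 1), repeat=len(b)):
        n = sum(nu)
        if n > max_order:
            continue
        coef = math.factorial(n)
        for k in nu:
            coef //= math.factorial(k)      # exact: partial quotients stay integers
        total += coef * math.prod(bj ** k for bj, k in zip(b, nu))
    return total

b = (0.3, 0.2, 0.1)                          # hypothetical sequence with sum 0.6 < 1
partial = multinomial_series(b, 40)
limit = 1.0 / (1.0 - sum(b))                 # geometric series limit
print(partial, limit)
```

If the entries of `b` summed to one or more, the partial sums would grow without bound as `max_order` increases.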

This result implies the $\sigma$-summability of the sequence $(\theta_\nu)$ of Legendre coefficients.

Proposition 5.6. Under Assumptions 3.1, 3.2, for $0 < \sigma < 1$ as in Assumption 5.3, $\sum_{\nu\in\mathcal{F}}|\theta_\nu|^\sigma$ is finite.

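Proof. We have from Proposition 5.4 that

$$|\theta_\nu| \le C\Big(\prod_{m\in I_\nu,\ m\le J_0}\frac{2(1+K)}{K}\,\eta^{\nu_m}\Big)\Big(\prod_{m\in I_\nu,\ m>J_0}\frac{2(1+K)}{K}\Big(\frac{4|\nu_F|\,\|\psi_m\|_{L^\infty(D)}}{a_{\min}\nu_m}\Big)^{\nu_m}\Big)$$

where $\eta = 1/(1+K) < 1$. Let $\mathcal{F}_E = \{\nu \in \mathcal{F} : I_\nu \subset E\}$ and $\mathcal{F}_F = \{\nu \in \mathcal{F} : I_\nu \subset F\}$. From this, we have

$$\sum_{\nu\in\mathcal{F}}|\theta_\nu|^\sigma \le C A_E A_F$$

where

$$A_E = \sum_{\nu\in\mathcal{F}_E}\prod_{m\in I_\nu}\Big(\frac{2(1+K)}{K}\Big)^{\sigma}\eta^{\sigma\nu_m}$$

and

$$A_F = \sum_{\nu\in\mathcal{F}_F}\prod_{m\in I_\nu}\Big(\frac{2(1+K)}{K}\Big)^{\sigma}\Big(\frac{4|\nu_F|\,\|\psi_m\|_{L^\infty(D)}}{a_{\min}\nu_m}\Big)^{\sigma\nu_m} .$$

We estimate $A_E$ and $A_F$: for $A_E$, we have

$$A_E = \Big(1 + \Big(\frac{2(1+K)}{K}\Big)^{\sigma}\sum_{m\ge1}\eta^{\sigma m}\Big)^{J_0}$$

which is finite due to $\eta < 1$. For $A_F$, we note that for $\nu_m \ne 0$,

$$\frac{2(1+K)}{K} \le \Big(\frac{2(1+K)}{K}\Big)^{\nu_m} .$$

Therefore

$$A_F \le \sum_{\nu\in\mathcal{F}_F}\prod_{m\in F}\Big(\frac{|\nu_F|\,d_m}{\nu_m}\Big)^{\sigma\nu_m}$$

where

$$d_m = \frac{8(1+K)\,\|\psi_m\|_{L^\infty(D)}}{K\,a_{\min}} .$$

With the convention that $0^0 = 1$ we obtain from the Stirling estimate

$$\frac{n!\,e^n}{e\sqrt{n}} \le n^n \le \frac{n!\,e^n}{\sqrt{2\pi n}}$$

that $|\nu|^{|\nu|} \le |\nu|!\,e^{|\nu|}$. Inserting this in the above bound for $A_F$, we obtain

$$\prod_{m\in F}\Big(\frac{|\nu_F|\,d_m}{\nu_m}\Big)^{\nu_m} \le \frac{|\nu_F|!}{\nu_F!}\,d^{\nu_F}\prod_{m\in F}\max\{1, e\sqrt{\nu_m}\} .$$

Hence

$$A_F \le \sum_{\nu\in\mathcal{F}_F}\Big(\frac{|\nu_F|!}{\nu_F!}\,\bar d^{\,\nu_F}\Big)^{\sigma}$$

where $\bar d_m = e\,d_m$ and where we used the estimate $e\sqrt{n} \le e^n$. From this, we have

$$\sum_{m\ge1}\bar d_m = \sum_{m\in F}\frac{8e(1+K)\,\|\psi_m\|_{L^\infty(D)}}{K\,a_{\min}} < \frac{8e(1+K)}{K\,a_{\min}}\cdot\frac{a_{\min}K}{24(1+K)} = \frac{e}{3} < 1 .$$

Since also $\|(\bar d_m)\|_{\ell^\sigma(\mathbb{N})} < \infty$ we obtain with Proposition 5.5 the conclusion. □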
We now show $\sigma$-summability of the Taylor coefficients $\tau_\nu$ in (57). To this end, we proceed as in the Legendre case: first we establish sharp bounds on the $\tau_\nu$ by complex variable methods, and then show $\sigma$-summability of $(\tau_\nu)_{\nu\in\mathcal{F}}$ by a sequence factorization argument.

5.3.4. Bounds on the Taylor coefficients $\tau_\nu$

Lemma 5.7. Assume UEAC($a_{\min}$, $a_{\max}$) and that $\rho = (\rho_j)_{j\ge1}$ is an $r$-admissible sequence of disc radii for some $0 < r < a_{\min}$. Then the Taylor coefficients $\tau_\nu$ of the parametric posterior density (57) satisfy

$$\forall \nu \in \mathcal{F}: \quad |\tau_\nu| \le \exp\Big(\frac{\|f\|_{V^*}^2}{r^2}\sum_{k=1}^{K}\|o_k\|_{V^*}^2\Big)\prod_{j\ge1}\rho_j^{-\nu_j} . \tag{70}$$

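Proof. For $\nu = (\nu_j)_{j\ge1} \in \mathcal{F}$ holds $J = \max\{j \in \mathbb{N} : \nu_j \ne 0\} < \infty$. For this $J$, define $\Theta_{[J]}(z_J) := \Theta(z_1, z_2, \ldots, z_J, 0, \ldots)$, i.e. $\Theta_{[J]}(z_J)$ denotes the function of $z_J \in \mathbb{C}^J$ obtained by setting in the posterior density $\Theta(z)$ all coordinates $z_j$ with $j > J$ equal to zero. Then

$$\partial_z^\nu\Theta(z)\big|_{z=0} = \frac{\partial^{|\nu|}\Theta_{[J]}}{\partial z_1^{\nu_1}\cdots\partial z_J^{\nu_J}}(0, \ldots, 0) .$$

Since the sequence $\rho$ is $r$-admissible it follows with (48) that

$$\sup_{(z_1,\ldots,z_J)\in\mathcal{U}_{\rho,J}}|\Theta_{[J]}(z_1, \ldots, z_J)| \le \exp\Big(\frac{\|f\|_{V^*}^2}{r^2}\sum_{k=1}^{K}\|o_k\|_{V^*}^2\Big) \tag{71}$$

for all $(z_1, \ldots, z_J)$ in the polydisc $\mathcal{U}_{\rho,J} := \bigotimes_{1\le j\le J}\{z_j \in \mathbb{C} : |z_j| \le \rho_j\} \subset \mathbb{C}^J$. We now prove (70) by Cauchy's integral formula. To this end, we define $\tilde\rho$ by

$$\tilde\rho_j := \rho_j + \varepsilon \ \text{ if } j \le J, \qquad \tilde\rho_j = \rho_j \ \text{ if } j > J, \qquad \varepsilon := \frac{r}{2\big\|\sum_{j\le J}|\psi_j|\big\|_{L^\infty(D)}} .$$

Then the sequence $\tilde\rho$ is $r/2$-admissible and therefore $\mathcal{U}_{\tilde\rho} \subset \mathcal{A}_{r/2}$. This implies that for each $z \in \mathcal{U}_{\tilde\rho}$, $\Theta$ is holomorphic in each variable $z_j$.

It follows that $\Theta_{[J]}$ is holomorphic in each variable $z_1, \ldots, z_J$ on the polydisc $\bigotimes_{1\le j\le J}\{|z_j| < \tilde\rho_j\}$ which is an open neighbourhood of $\mathcal{U}_{\rho,J}$ in $\mathbb{C}^J$.

We may thus apply the Cauchy formula (e.g. Theorem 2.1.2 of [13]) in each variable $z_j$:

$$\Theta_{[J]}(\tilde z_1, \ldots, \tilde z_J) = (2\pi i)^{-J}\int_{|z_1|=\rho_1}\cdots\int_{|z_J|=\rho_J}\frac{\Theta_{[J]}(z_1, \ldots, z_J)}{(z_1 - \tilde z_1)\cdots(z_J - \tilde z_J)}\,dz_1\cdots dz_J .$$

We infer

$$\frac{\partial^{|\nu|}}{\partial\tilde z_1^{\nu_1}\cdots\partial\tilde z_J^{\nu_J}}\Theta_{[J]}(0, \ldots, 0) = \nu!\,(2\pi i)^{-J}\int_{|z_1|=\rho_1}\cdots\int_{|z_J|=\rho_J}\frac{\Theta_{[J]}(z_1, \ldots, z_J)}{z_1^{\nu_1+1}\cdots z_J^{\nu_J+1}}\,dz_1\cdots dz_J .$$

Bounding the integrand on $\{|z_1| = \rho_1\}\times\cdots\times\{|z_J| = \rho_J\} \subset \mathcal{A}_r$ with (48) implies (70). □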
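The Cauchy-integral bound on Taylor coefficients can be observed in a single complex variable. The sketch below uses a toy analytic density with hypothetical constants `kappa`, `delta` and radius `rho` (all assumptions, with `rho` below the distance $1/\kappa$ to the singularity), approximates the contour integrals by the trapezoidal rule via an FFT, and compares $|\tau_n|\rho^n$ with the supremum of the density on the circle. It replaces the explicit bound of Theorem 4.8 by that numerically computed supremum.

```python
import numpy as np

kappa, delta, rho = 0.5, 1.0, 1.5      # hypothetical constants; rho < 1/kappa = 2

def theta(z):
    """Toy density exp(-(delta - 1/(1 + kappa*z))^2 / 2), analytic for |z| < 1/kappa."""
    return np.exp(-0.5 * (delta - 1.0 / (1.0 + kappa * z)) ** 2)

M = 256                                 # number of quadrature points on the circle
z = rho * np.exp(2j * np.pi * np.arange(M) / M)
vals = theta(z)

# Trapezoidal rule for tau_n = (2 pi i)^{-1} * contour integral of Theta(z) z^{-n-1}
tau = np.fft.fft(vals) / M / rho ** np.arange(M)
sup_on_circle = np.max(np.abs(vals))
n_check = 20

for n in range(5):
    print(n, abs(tau[n]), sup_on_circle * rho ** (-n))
```

For this toy density $\tau_0 = \Theta(0) = 1$ and $\tau_1 = 0$, because the misfit $\delta - 1/(1+\kappa z)$ vanishes at $z = 0$ when $\delta = 1$.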

5.3.5. $\sigma$-summability of the $\tau_\nu$. Proceeding in a similar fashion as in Section 3 of [7], we can prove the $\sigma$-summability of the Taylor coefficients $\tau_\nu$.

Proposition 5.8. Under Assumptions 3.1, 3.2 and 5.3, $(|\tau_\nu|)_{\nu\in\mathcal{F}} \in \ell^\sigma(\mathcal{F})$ for $0 < \sigma < 1$ as in Assumption 5.3.

We remark that under the same assumptions, we also have $\sigma$-summability of $(\tau_\nu/(\nu+1)!)_{\nu\in\mathcal{F}}$, since $|\tau_\nu|/(\nu+1)! \le |\tau_\nu|$ for all $\nu \in \mathcal{F}$.
5.4. Best N-term convergence rates

With (58), we infer from Proposition 5.6 and from (60) convergence rates for "polynomial chaos" type approximations of the posterior density $\Theta$.

Theorem 5.9. If Assumptions 3.1, 3.2 and 5.3 hold then there is a sequence $(\Lambda_N)_{N\in\mathbb{N}} \subset \mathcal{F}$ of index sets with cardinality not exceeding $N$ (depending on $\sigma$ and on the data $\delta$) such that the corresponding $N$-term truncated gpc Legendre expansions $\Theta_{\Lambda_N}$ in (59) satisfy

$$\|\Theta - \Theta_{\Lambda_N}\|_{L^2(U,\mu_0(dy))} \le N^{-(\frac1\sigma - \frac12)}\|(\theta_\nu)\|_{\ell^\sigma(\mathcal{F};\mathbb{R})} . \tag{72}$$

Likewise, for $q = 1, \infty$ and for every $N \in \mathbb{N}$, there exist sequences $(\Lambda_N)_{N\in\mathbb{N}} \subset \mathcal{F}$ of index sets (depending, in general, on $\sigma$, $q$ and the data) whose cardinality does not exceed $N$ such that the $N$-term truncated Taylor sums (61) converge with rate $1/\sigma - 1$, i.e.

$$\|\Theta - T_{\Lambda_N}\|_{L^q(U,\mu_0(dy))} \le N^{-(\frac1\sigma - 1)}\|(\tau_\nu)\|_{\ell^\sigma(\mathcal{F};\mathbb{R})} . \tag{73}$$

Here, for $q = \infty$ the norm $\|\cdot\|_{L^\infty(U;\mu_0)}$ is the supremum over all $y \in U$.



6. Approximation of Expectations under the Posterior 

Recall that in our approach to Bayesian estimation, the expectations under the posterior given data $\delta$ are ratios of deterministic, infinite dimensional parametric integrals $Z'$ and $Z$ with respect to the prior measure $\mu_0$, given by (10) and (12). For our specific elliptic inverse problem these reduce to iterated integrals over the coordinates $y_j \in [-1,1]$ against a countable product of the uniform probability measures $\frac12 dy_j$. To render this practically feasible, numerical evaluation of integrals of the form

$$Z' = \int_{y\in U}\phi(u(\cdot, y))\,\Theta(y)\,\mu_0(dy) \in S \tag{74}$$

is required for functions $\phi : U \to S$, for a suitable state space $S$. Note that the choice $\phi \equiv 1$ gives $Z$. For $\phi$ not identically 1, the integral (74) gives the (posterior) conditional expectation $\mathbb{E}^{\mu^\delta}[\phi(u)]$ if normalized by $Z$.

For the elliptic inverse problems studied here, the choices of $\phi(u)$ given by (13) with $G(u) = p$ are of particular interest. For $m = 1$ this gives rise to the need to evaluate the integrals

$$Z' = \int_{y\in U}p(\cdot, y)\,\Theta(y)\,\mu_0(dy) \in V \tag{75}$$

which, when normalized by $Z$, gives the (posterior) conditional expectation $\mathbb{E}^{\mu^\delta}[p]$. We study how to approximate this integral. With the techniques developed here, and with Corollary 4.9, analogous results can also be established for expectations of $m$ point correlations of $G(u)$ as in (13), using (74), and the normalization constant $Z$.

Our objective is to find constructive algorithms which achieve the high rates of convergence, in terms of number of retained terms in a gpc expansion, implied by the theory of the previous section, and offering the potential of beating the complexity of Monte Carlo based methods. The first option to do so is to employ sparse tensor numerical integration schemes over $U$ tailored to the regularity afforded by the analytic parameter dependence of the posterior density on $y$ and of the integrands in (74). This approach is not considered here, but is considered elsewhere: we refer to [1] for details and numerical experiments. Here we adopt an approach based on showing that the integrals (74) allow semianalytic evaluation in log-linear‖ complexity with respect to $N$, the number of "active" terms in a truncated polynomial chaos expansion of the parametric solution of the forward problem (jl]).

To this end, we proceed as follows: based on the assumption that $N$-term gpc approximations of the parametric forward solutions $p(x, y)$ of (14) are available, for example by the algorithms in [3, 10, 5], we show that it is possible to construct separable $N$-term approximations of the integrands in (74). The existence of such an approximate posterior density which is "close" to $\Theta$ is ensured by Theorem 5.9 provided the (unknown) input data $u$ satisfies certain conditions. We prove, first, that sets $\Lambda_N \subset \mathcal{F}$ of cardinality at most $N$ which afford the truncation errors (72), (73) can be found in log-linear complexity with respect to $N$; second, that the integrals (74) with the corresponding approximate posterior density can be evaluated in such complexity; and, third, we estimate the errors in the resulting conditional expectations.

6.1. Assumptions and Notation 

Assumption 6.1. Given a draw $u$ of the data, an exact forward solution $p$ of the governing equation (14) for this draw of data $u$ is available at unit cost.

This assumption is made in order to simplify the exposition. All conclusions remain valid if this assumption is relaxed to include an additional Finite Element discretization error; we refer to [1] for details. We shall use the notion of monotone sets of multiindices.

Definition 6.2. A subset $\Lambda_N \subset \mathcal{F}$ of finite cardinality $N$ is called monotone if (M1) $\{0\} \subset \Lambda_N$ and if (M2) for all $0 \ne \nu \in \Lambda_N$ it holds that $\nu - e_j \in \Lambda_N$ for all $j \in I_\nu$, where $e_j \in \{0,1\}^J$ denotes the index vector with 1 in position $j \in J$ and 0 in all other positions.
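Conditions (M1) and (M2) translate directly into a membership test on finite index sets. The following is a minimal sketch, assuming multi-indices are stored as equal-length tuples of non-negative integers; the function name `is_monotone` and the example sets are illustrative.

```python
def is_monotone(index_set):
    """Check (M1) and (M2) for a finite set of multi-indices given as equal-length tuples."""
    s = set(index_set)
    if not s:
        return False
    dim = len(next(iter(s)))
    if tuple([0] * dim) not in s:              # (M1): the zero index belongs to the set
        return False
    for nu in s:                               # (M2): nu - e_j stays in the set
        for j, k in enumerate(nu):
            if k > 0:
                lower = nu[:j] + (k - 1,) + nu[j + 1:]
                if lower not in s:
                    return False
    return True

good = {(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)}
bad = {(0, 0), (2, 0)}                         # (1, 0) is missing
print(is_monotone(good), is_monotone(bad))
```

Applying (M2) repeatedly shows that a monotone set contains every multi-index that is componentwise dominated by one of its members.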

Note that for monotone index sets $\Lambda_N \subset \mathcal{F}$ properties (M1) and (M2) in Definition 6.2 imply

$$\mathbb{P}_{\Lambda_N}(U) = \mathrm{span}\{y^\nu : \nu \in \Lambda_N\} = \mathrm{span}\{L_\nu : \nu \in \Lambda_N\} . \tag{76}$$
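One inclusion behind this identity is that every tensorized Legendre polynomial indexed by a monotone set only involves monomials whose exponents lie in the same set; since both families have the same number of linearly independent elements, the spans coincide. The sketch below, with illustrative names and a hypothetical index set `Lam`, lists the monomial support of each product of classical Legendre polynomials (the normalization does not change the support) and checks the inclusion.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def legendre_monomial_support(nu, tol=1e-14):
    """Multi-indices mu such that y^mu appears in the product of P_{nu_j}(y_j)."""
    per_var = []
    for k in nu:
        c = np.zeros(k + 1)
        c[k] = 1.0
        mono = leg.leg2poly(c)                  # monomial coefficients of P_k
        per_var.append([i for i, a in enumerate(mono) if abs(a) > tol])
    support = [()]
    for degs in per_var:
        support = [mu + (d,) for mu in support for d in degs]
    return set(support)

Lam = {(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (2, 1)}      # a monotone set
inside = all(legendre_monomial_support(nu) <= Lam for nu in Lam)

not_monotone = {(0, 0), (3, 0)}                              # violates (M2)
inside_bad = all(legendre_monomial_support(nu) <= not_monotone for nu in not_monotone)
print(inside, inside_bad)
```

For the non-monotone set the polynomial $P_3$ contains the monomial $y_1$, whose exponent $(1,0)$ is not in the set, so the two spans differ.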

Next, we will assume that a stochastic Galerkin approximation of the entire forward map of the parametric, deterministic solution with certain optimality properties is available.

‖ Meaning linear multiplied by a logarithmic factor.

Assumption 6.3. Given a parametric representation (19) of the unknown data $u$, a stochastic Galerkin approximation $p_N \in \mathbb{P}_{\Lambda_N}(U, V)$ of the exact forward solution $p$ of the governing equation (14) is available at unit cost. Here the set $\Lambda_N \subset \mathcal{F}$ is a finite subset of "active" gpc Legendre coefficients whose cardinality does not exceed $N$. In addition, we assume that the gpc approximation $p_N \in \mathbb{P}_{\Lambda_N}(U, V)$ is quasi optimal in terms of the best $N$-term approximation, i.e. there exists $C \ge 1$ independent of $N$ such that

$$\|p - p_N\|_{L^2(U,\mu_0;V)} \le C N^{-(\frac1\sigma - \frac12)}\|(p_\nu)\|_{\ell^\sigma(\mathcal{F};V)} . \tag{77}$$

Here $0 < \sigma < 1$ denotes the summability exponent in Assumption 5.3. Note that best $N$-term approximations satisfy (77) with $C = 1$; we may refer to (77) as a quasi best $N$-term approximation property.

This best $N$-term convergence rate of stochastic Galerkin Finite Element Method (sGFEM) approximations follows from results in [6, 7], but these results do not indicate how sequences of sGFEM approximations which converge with this rate are actually constructed. We refer to [10] for constructive algorithms for quasi best $N$-term Legendre Galerkin approximations, to [5] for constructive algorithms for quasi best $N$-term Taylor approximations, and to the references there for further details on such sGFEM solvers, including space discretization. In what follows, we work under Assumptions 6.1 and 6.3.

6.2. Best N-term based approximate conditional expectation

We first address the rates that can be achieved by the (a-priori not accessible) best $N$-term approximations of the posterior density $\Theta$ in Theorem 5.9. These rates serve as benchmark rates to be achieved by any constructive procedure.

To derive these rates, we let $\Theta_N = \Theta_{\Lambda_N}$ denote the best $N$-term Legendre approximations of the posterior density in Theorem 5.9. With (77), we estimate

$$\Big\|\int_U(\Theta p - \Theta_N p_N)\,\mu_0(dy)\Big\|_V \le \|\Theta - \Theta_N\|_{L^2(U,\mu_0)}\|p\|_{L^2(U,\mu_0;V)} + \|\Theta_N\|_{L^2(U,\mu_0)}\|p - p_N\|_{L^2(U,\mu_0;V)} .$$

With $T_N = T_{\Lambda_N}$ denoting a best $N$-term Taylor approximation of $\Theta$ in Theorem 5.9 we obtain in the same fashion the bound

$$\begin{aligned}
\Big\|\int_U(\Theta p - T_N p_N)\,\mu_0(dy)\Big\|_V &= \Big\|\int_U\big((\Theta - T_N)p + T_N(p - p_N)\big)\,\mu_0(dy)\Big\|_V \\
&\le \int_U|\Theta - T_N|\,\|p\|_V\,\mu_0(dy) + \|T_N\|_{L^\infty(U)}\|p - p_N\|_{L^1(U,\mu_0;V)} \\
&\le \|\Theta - T_N\|_{L^1(U,\mu_0)}\|p\|_{L^\infty(U,\mu_0;V)} + \|T_N\|_{L^\infty(U)}\|p - p_N\|_{L^2(U,\mu_0;V)} .
\end{aligned}$$

We now address question ii) raised at the beginning of Section 5.2, i.e. the design of practical algorithms for the construction of sequences $(\Lambda_N)_{N\in\mathbb{N}} \subset \mathcal{F}$ such that the best $N$-term convergence rates asserted in Theorem 5.9 are attained. We develop the approximation in detail for (75); similar results for (74) may be developed for various choices of $\phi$.



6.3. Constructive N-term Approximation of the Potential $\Phi$

We show that, from the quasi best $N$-term optimal stochastic Galerkin approximation $p_N \in \mathbb{P}_{\Lambda_N}(U, V)$ and, in particular, from its (monotone) index set $\Lambda_N$, a corresponding $N$-term approximation $\Phi_N$ of the potential $\Phi$ in (3) can be computed. We denote the observation corresponding to the stochastic Galerkin approximation of the system response $p_N$ by $\mathcal{G}_N$, i.e. the mapping

$$U \ni y \mapsto \mathcal{G}_N(u)\big|_{u = \bar a + \sum_{j\in J}y_j\psi_j} = (\mathcal{O}\circ G_N)(u)\big|_{u = \bar a + \sum_{j\in J}y_j\psi_j} \tag{78}$$

where $G_N(u) = p_N \in \mathbb{P}_{\Lambda_N}(U; V)$. By the linearity and boundedness of the observation functional $\mathcal{O}(\cdot)$ then $\mathcal{G}_N \in \mathbb{P}_{\Lambda_N}(U; \mathbb{R}^K)$; in the following, we assume for simplicity $K = 1$ so that $\mathcal{G}_N|_{u = \bar a + \sum_{j\in J}y_j\psi_j} \in \mathbb{P}_{\Lambda_N}(U)$. We then denote by $U \ni y \mapsto \Phi(u)|_{u = \bar a + \sum_{j\in J}y_j\psi_j}$ the potential in (3) and by $\Phi_N$ the potential of the stochastic Galerkin approximation $\mathcal{G}_N$ of the forward observation map. For notational convenience, we suppress the explicit dependence on the data $\delta$ in the following and assume that the Gaussian covariance $\Gamma$ of the observational noise $\eta$ in (1) is the identity: $\Gamma = I$. Then, for every $y \in U$, with $u = \bar a + \sum_{j\in J}y_j\psi_j$, the exact potential $\Phi$ and the potential $\Phi_N$ based on the $N$-term approximation $p_N$ of the forward solution take the form

$$\Phi(y) = \frac12\big(\delta - \mathcal{G}(u)\big)^2, \qquad \Phi_N(y) = \frac12\big(\delta - \mathcal{G}_N(u)\big)^2 . \tag{79}$$

By Lemma 4.7 these potentials admit extensions to holomorphic functions of the variables $z \in S_\rho$ in the strip $S_\rho$ defined in (46). Since $\Lambda_N$ is monotone, we may write $p_N \in \mathbb{P}_{\Lambda_N}(U, V)$ and $\mathcal{G}_N \in \mathbb{P}_{\Lambda_N}(U)$ in terms of their (uniquely defined) Taylor expansions about $y = 0$:

$$\mathcal{G}_N(u) = \sum_{\nu\in\Lambda_N}g_\nu y^\nu . \tag{80}$$

This implies, for every $y \in U$, $\Phi_N(y) = \frac12\big(\delta^2 - 2\delta\,\mathcal{G}_N(y) + (\mathcal{G}_N(y))^2\big)$ where

$$(\mathcal{G}_N(y))^2 = \sum_{\nu,\nu'\in\Lambda_N}g_\nu g_{\nu'}\,y^{\nu+\nu'} \in \mathbb{P}_{\Lambda_N+\Lambda_N}(U)$$

has a higher polynomial degree and possibly $O(N^2)$ coefficients. Therefore, an exact evaluation of a gpc approximation of the potential $\Phi_N$ might incur loss of linear complexity with respect to $N$. To preserve log-linear complexity in $N$, we perform an $N$-term truncation $[\Phi_N]_{\#N}$ of $\Phi_N$, thereby introducing an additional error which, as we show next, is of the same order as the error of gpc approximation of the system's response. The following Lemma is stated in slightly more general form than is presently needed, since it will also be used for the error analysis of the posterior density ahead.

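Lemma 6.4. Consider two sequences $(g_\nu) \in \ell^\sigma(\mathcal{F})$, $(g'_{\nu'}) \in \ell^\sigma(\mathcal{F}')$, $0 < \sigma < 1$. Then $(g_\nu g'_{\nu'})_{(\nu,\nu')\in\mathcal{F}\times\mathcal{F}'} \in \ell^\sigma(\mathcal{F}\times\mathcal{F}')$ and there holds

$$\|(g_\nu g'_{\nu'})\|_{\ell^\sigma(\mathcal{F}\times\mathcal{F}')} \le \|(g_\nu)\|_{\ell^\sigma(\mathcal{F})}\,\|(g'_{\nu'})\|_{\ell^\sigma(\mathcal{F}')} . \tag{81}$$

Moreover, a best $N$-term truncation $[\cdot]_{\#N}$ of products of corresponding best $N$-term truncated Taylor polynomials, defined by

$$\Big[\Big(\sum_{\nu\in\Lambda_N}g_\nu y^\nu\Big)\Big(\sum_{\nu'\in\Lambda'_N}g'_{\nu'}y^{\nu'}\Big)\Big]_{\#N} := \sum_{(\nu,\nu')\in\tilde\Lambda_N}g_\nu g'_{\nu'}\,y^{\nu+\nu'} \in \mathbb{P}_{\bar\Lambda_N}(U) \tag{82}$$

where $\tilde\Lambda_N \subset \mathcal{F}\times\mathcal{F}'$ is the set of index pairs $(\nu,\nu') \in \mathcal{F}\times\mathcal{F}'$ of at most $N$ largest (in absolute value) products $g_\nu g'_{\nu'}$, has a pointwise error in $U$ bounded by

$$N^{-(\frac1\sigma - 1)}\,\|(g_\nu)\|_{\ell^\sigma(\mathcal{F})}\,\|(g'_{\nu'})\|_{\ell^\sigma(\mathcal{F}')} . \tag{83}$$

Moreover, if the index sets $\Lambda_N \subset \mathcal{F}$ and $\Lambda'_N \subset \mathcal{F}'$ are each monotone, the index set $\bar\Lambda_N := \{\nu + \nu' : (\nu,\nu') \in \tilde\Lambda_N\} \subset \mathcal{F}$ can be chosen monotone with cardinality at most $2N$.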

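Proof. We calculate
$$\|(g_\nu g'_{\nu'})\|^\sigma_{\ell^\sigma(\mathcal{F}\times\mathcal{F}')} = \sum_{\nu\in\mathcal{F}}\sum_{\nu'\in\mathcal{F}'} |g_\nu g'_{\nu'}|^\sigma = \Big(\sum_{\nu\in\mathcal{F}} |g_\nu|^\sigma\Big)\Big(\sum_{\nu'\in\mathcal{F}'} |g'_{\nu'}|^\sigma\Big) = \|(g_\nu)\|^\sigma_{\ell^\sigma(\mathcal{F})}\,\|(g'_{\nu'})\|^\sigma_{\ell^\sigma(\mathcal{F}')}.$$
Since $(g_\nu g'_{\nu'}) \in \ell^\sigma(\mathcal{F} \times \mathcal{F}')$, we may apply (58) with (57) as follows.
$$\Big\|\sum_{\nu\in\Lambda_N}\sum_{\nu'\in\Lambda'_N} g_\nu g'_{\nu'}\, y^{\nu+\nu'} - \Big[\sum_{\nu\in\Lambda_N}\sum_{\nu'\in\Lambda'_N} g_\nu g'_{\nu'}\, y^{\nu+\nu'}\Big]_{\#N}\Big\|_{L^\infty(U)} \le \sum_{(\nu,\nu')\notin\Lambda^1_N} |g_\nu g'_{\nu'}| \le N^{-(1/\sigma-1)}\,\|(g_\nu)\|_{\ell^\sigma(\mathcal{F})}\,\|(g'_{\nu'})\|_{\ell^\sigma(\mathcal{F}')}.$$
Evidently, $\widehat{\Lambda}_N \subseteq \Lambda_N + \Lambda'_N$ and the cardinality of the set $\Lambda_N + \Lambda'_N$ is at most $2N$. If $\Lambda_N$
and $\Lambda'_N$ are monotone, then $\Lambda_N + \Lambda'_N$ is monotone. To see it, let $\mu \in \Lambda_N + \Lambda'_N$. Then
$\mu = \nu + \nu'$ for some $\nu \in \Lambda_N$, $\nu' \in \Lambda'_N$, and $\mathbb{I}_\mu = \mathbb{I}_\nu \cup \mathbb{I}_{\nu'}$. Let $0 \ne \mu$, $j \in \mathbb{I}_\mu$ and assume
w.l.o.g. that $j \in \mathbb{I}_\nu$. Then $\mu - e_j = (\nu - e_j) + \nu' \in \Lambda_N + \Lambda'_N$ by the assumed monotonicity
of the set $\Lambda_N$. If $j \in \mathbb{I}_{\nu'}$, the argument is analogous. Therefore $\mu - e_j \in \Lambda_N + \Lambda'_N$ for
every $j \in \mathbb{I}_\mu$. Hence $\Lambda_N + \Lambda'_N \subset \mathcal{F}$ is monotone. □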
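
The truncation in (82) and the bound (83) can be checked numerically on a small example. The following Python sketch is illustrative only: the helper names `truncated_product`, `evaluate` and `ell_sigma_norm`, the two coefficient dictionaries, and the choice $U = [-1,1]^2$ are assumptions made for this example and are not part of the analysis. It keeps the $N$ largest products $|g_\nu g'_{\nu'}|$ and compares the discarded tail with the right-hand side of (83).

```python
import itertools


def truncated_product(g, g_prime, n_terms):
    """Split all products g_nu * g'_nu' into the n_terms largest (kept) and the rest (dropped)."""
    pairs = [
        ((nu, nu_p), c * c_p)
        for (nu, c), (nu_p, c_p) in itertools.product(g.items(), g_prime.items())
    ]
    pairs.sort(key=lambda item: abs(item[1]), reverse=True)
    return pairs[:n_terms], pairs[n_terms:]


def evaluate(pairs, y):
    """Evaluate sum of coeff * y^(nu + nu') over the given index pairs at the point y."""
    total = 0.0
    for (nu, nu_p), coeff in pairs:
        monomial = 1.0
        for y_j, a, b in zip(y, nu, nu_p):
            monomial *= y_j ** (a + b)
        total += coeff * monomial
    return total


def ell_sigma_norm(coeffs, sigma):
    """The ell^sigma quasi-norm (sum |c|^sigma)^(1/sigma)."""
    return sum(abs(c) ** sigma for c in coeffs) ** (1.0 / sigma)


# Hypothetical Taylor coefficients, keyed by multi-indices in two variables.
g = {(0, 0): 1.0, (1, 0): 0.5, (0, 1): 0.25, (2, 0): 0.125}
g_prime = {(0, 0): 1.0, (1, 0): 0.3, (0, 1): 0.1}
sigma, n_terms = 0.5, 4

kept, dropped = truncated_product(g, g_prime, n_terms)
tail = sum(abs(c) for _, c in dropped)  # dominates the pointwise error on [-1, 1]^2
bound = (
    n_terms ** (-(1.0 / sigma - 1.0))
    * ell_sigma_norm(g.values(), sigma)
    * ell_sigma_norm(g_prime.values(), sigma)
)
print(f"discarded tail = {tail:.4f}, right-hand side of (83) = {bound:.4f}")
```

Since $|y^\nu| \le 1$ on $[-1,1]^2$, the pointwise truncation error is at most the sum of the discarded absolute products, which is the quantity compared with (83) here.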



Lemma 6.4 is key to the analysis of consistency errors in the approximate evaluation
of $N$-term truncated power series and, in particular, of the potential $\exp(-\Phi)$
which appears in the posterior density $\Theta$. It crucially involves Taylor-type polynomial
chaos expansions. Expansions based on Legendre (or other) univariate polynomial bases
can be covered by Lemma 6.4 by conversion to monomial bases, using (76), as long as
$N$-term truncations are restricted to monotone index sets $\Lambda_N \subset \mathcal{F}$.

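Applying Lemma 6.4 with $\mathcal{F}' = \mathcal{F}$ and with $(g'_{\nu'})_{\nu'\in\mathcal{F}'} = (g_\nu)_{\nu\in\mathcal{F}}$, we find
$$\sup_{y\in U}\big|\Phi_N(y) - [\Phi_N(y)]_{\#N}\big| = \sup_{y\in U}\big|(\mathcal{G}_N(y))^2 - [(\mathcal{G}_N(y))^2]_{\#N}\big| \le N^{-(1/\sigma-1)}\,\|(g_\nu)\|^2_{\ell^\sigma(\mathcal{F})}. \tag{84}$$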

6.4. Constructive N-term approximation of $\Theta = \exp(-\Phi)$

With the $N$-term approximation $[\Phi_N]_{\#N}$, we now define the constructive N-term
approximation $\Theta_N$ of the posterior density. We continue to work under Assumption 6.3,
i.e. that $N$-term truncated gpc-approximations $p_N$ of the forward solution $p(y) = G(u(y))$
of the parametric problem are available which satisfy (77). For an integer $K(N) \in \mathbb{N}$
to be selected below, we define
$$\Theta_N := \sum_{k=0}^{K(N)} \frac{(-1)^k}{k!}\,\big[([\Phi_N]_{\#N})^k\big]_{\#N}. \tag{85}$$

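We then estimate (all integrals are with respect to the prior measure $\mu_0(dy)$)
$$\|\Theta - \Theta_N\|_{L^1(U)} = \Big\|e^{-\Phi} - \sum_{k=0}^{K(N)} \frac{(-1)^k}{k!}\big[([\Phi_N]_{\#N})^k\big]_{\#N}\Big\|_{L^1(U)} \le \big\|e^{-\Phi} - e^{-[\Phi_N]_{\#N}}\big\|_{L^1(U)} + \Big\|e^{-[\Phi_N]_{\#N}} - \sum_{k=0}^{K(N)} \frac{(-1)^k}{k!}\big[([\Phi_N]_{\#N})^k\big]_{\#N}\Big\|_{L^1(U)} =: I + II.$$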

We estimate both terms separately. 

For term $I$, we observe that due to $x = [\Phi_N]_{\#N} - \Phi \ge 0$ for sufficiently large values
of $N$, it holds $0 \le 1 - e^{-x} \le x$, so that by the triangle inequality and the bound (84)
$$I \le \|\Phi - [\Phi_N]_{\#N}\|_{L^1(U)} \le C\big(\|p - p_N\|_{L^2(U;V)} + N^{-(1/\sigma-1)}\big),$$
where $C$ depends on $\delta$, but is independent of $N$. In the preceding estimate, we used
that $\Phi \ge 0$ and $0 \le \Theta = \exp(-\Phi) \le 1$ imply
$$\|\Phi - \Phi_N\|_{L^1(U)} \le \|\mathcal{O}\|_{V^*}\,\|p - p_N\|_{L^2(U;V)}\big(2|\delta| + \|\mathcal{O}\|_{V^*}\,\|p + p_N\|_{L^2(U;V)}\big).$$



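We turn to term $II$. Using the (globally convergent) series expansion of the exponential
function, we may estimate with the triangle inequality
$$II \le \|R_{K(N)}\|_{L^1(U)} + \sum_{k=0}^{K(N)} \frac{1}{k!}\,\big\|([\Phi_N]_{\#N})^k - \big[([\Phi_N]_{\#N})^k\big]_{\#N}\big\|_{L^1(U)} \tag{86}$$
where the remainder $R_{K(N)}$ equals
$$R_{K(N)} = \sum_{k=K(N)+1}^{\infty} \frac{(-1)^k}{k!}\,([\Phi_N]_{\#N})^k. \tag{87}$$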



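To estimate the second term in the bound (86) we claim that for every $k, N \in \mathbb{N}_0$ holds
$$\big\|([\Phi_N]_{\#N})^k - \big[([\Phi_N]_{\#N})^k\big]_{\#N}\big\|_{L^\infty(U)} \le N^{-(1/\sigma-1)}\,\|(g_\nu)\|^{2k}_{\ell^\sigma(\mathcal{F})}. \tag{88}$$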



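We prove (88) for arbitrary, fixed $N \in \mathbb{N}$ by induction with respect to $k$. For $k = 0, 1$,
the bound is obvious. Assume now that the bound has been established for all powers
up to some $k \ge 2$. Writing $([\Phi_N]_{\#N})^{k+1} = ([\Phi_N]_{\#N})^k\,[\Phi_N]_{\#N}$ and denoting the sequence
of Taylor coefficients of $[\Phi_N]^k$ by $g'_{\nu'}$, with $\nu' \in (\mathcal{F} \times \mathcal{F})^k \simeq \mathcal{F}^{2k}$, we note that by $k$-fold
application of (81) it follows $\|(g'_{\nu'})\|_{\ell^\sigma(\mathcal{F}^{2k})} \le \|(g_\nu)\|^{2k}_{\ell^\sigma(\mathcal{F})}$. By the definition of $[\Phi_N]_{\#N}$,
the same bound also holds for the coefficients of $([\Phi_N]_{\#N})^k$, for every $k \in \mathbb{N}$. We may
therefore apply Lemma 6.4 to the product $([\Phi_N]_{\#N})^k\,[\Phi_N]_{\#N}$ and obtain the estimate
(88) with $k + 1$ in place of $k$ from (83). Inserting (88) into (86), we find
$$\sum_{k=0}^{K(N)} \frac{1}{k!}\,\big\|([\Phi_N]_{\#N})^k - \big[([\Phi_N]_{\#N})^k\big]_{\#N}\big\|_{L^\infty(U)} \le N^{-(1/\sigma-1)} \sum_{k=0}^{K(N)} \frac{1}{k!}\,\|(g_\nu)\|^{2k}_{\ell^\sigma(\mathcal{F})} \le N^{-(1/\sigma-1)} \exp\big(\|(g_\nu)\|^{2}_{\ell^\sigma(\mathcal{F})}\big). \tag{89}$$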

In a similar fashion, we estimate the remainder $R_{K(N)}$ in (86): as the truncated Taylor
expansion $[\Phi_N]_{\#N}$ converges pointwise to $\Phi_N$ and to $\Phi \ge 0$, for sufficiently large $N$,
we have $[\Phi_N]_{\#N} \ge 0$ for all $y \in U$, so that the series (87) is alternating and converges
pointwise. Hence its truncation error is bounded by the leading term of the tail sum:
$$\|R_{K(N)}\|_{L^\infty(U)} \le \frac{\|[\Phi_N]_{\#N}\|^{K(N)+1}_{L^\infty(U)}}{(K(N)+1)!}. \tag{90}$$
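
The scalar mechanism behind (90) can be illustrated with a short Python sketch. It is not part of the proof: the function names `exp_neg_partial_sum` and `leading_tail_term` and the sample value $x = 1.5$ are assumptions for this example, with $x \ge 0$ standing in for a pointwise value of $[\Phi_N]_{\#N}(y)$. The sketch compares the error of the truncated series for $e^{-x}$ with the first omitted term $x^{K+1}/(K+1)!$.

```python
import math


def exp_neg_partial_sum(x, K):
    """Partial sum sum_{k=0}^{K} (-x)^k / k! of the series for exp(-x)."""
    return sum((-x) ** k / math.factorial(k) for k in range(K + 1))


def leading_tail_term(x, K):
    """Absolute value of the first omitted term, x^(K+1) / (K+1)!."""
    return x ** (K + 1) / math.factorial(K + 1)


x = 1.5  # hypothetical nonnegative value of the truncated potential
for K in (2, 5, 10):
    error = abs(math.exp(-x) - exp_neg_partial_sum(x, K))
    print(K, error, leading_tail_term(x, K))
```

For every $x \ge 0$ the error stays below the first omitted term, which is the pointwise statement that (90) takes the supremum of.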

Now, given $N$ sufficiently large, we choose $K(N)$ so that the bound (90) is smaller than
$N^{-(1/\sigma-1)}$, which leads with Stirling's formula in (90) to the requirement
$$(K+1)\ln\Big(\frac{A}{K+1}\Big) \le \ln B - \Big(\frac{1}{\sigma} - 1\Big)\ln N \tag{91}$$
for some constants $A, B > 0$ independent of $K$ and $N$ (depending on $p$ and on $(g_\nu)$).
One verifies that (91) is satisfied by selecting $K(N) \sim \ln N$.

Therefore, under Assumptions 6.1 and 6.3 we have shown how to construct an
$N$-term approximate posterior density $\Theta_N$ by summing $K = O(\ln N)$ many terms in
(85). The approximate posterior density has at most $O(N)$ nontrivial terms, which can
be integrated exactly against the separable prior $\mu_0$ over $U$ in complexity that behaves
log-linearly with respect to $N$. Under Assumptions 6.1 and 6.3, the construction of $\Theta_N$
requires $K$-fold performance of the $[\cdot]_{\#N}$-truncation operation in (82) of products of
Taylor expansions, with each factor having at most $N$ nontrivial entries, amounting
altogether to solving (possibly approximately) $O(KN\ln N) = O(N(\ln N)^2)$ forward
problems.
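
The logarithmic growth of the number of series terms can be explored numerically. In the Python sketch below, the function `smallest_K` and the constant `B` (a hypothetical uniform bound on $\|[\Phi_N]_{\#N}\|_{L^\infty(U)}$) are assumptions introduced for illustration; the sketch searches for the smallest $K$ with $B^{K+1}/(K+1)! \le N^{-(1/\sigma-1)}$ and prints it next to $\ln N$.

```python
import math


def smallest_K(N, sigma, B=1.0):
    """Smallest K >= 0 with B**(K+1) / (K+1)! <= N**(-(1/sigma - 1))."""
    target = N ** (-(1.0 / sigma - 1.0))
    K = 0
    term = B  # equals B**(K+1) / (K+1)! for K = 0
    while term > target:
        K += 1
        term *= B / (K + 1)  # update to B**(K+1) / (K+1)!
    return K


for N in (10, 10**2, 10**4, 10**8):
    K = smallest_K(N, sigma=0.5, B=2.0)
    print(N, K, math.log(N))
```

Under these assumptions the smallest admissible $K$ grows no faster than a multiple of $\ln N$, in line with the choice $K(N) \sim \ln N$ made above.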

Remark 6.5. Inspecting the (constructive) proof of Lemma 6.4 and the definition of the
$N$-term approximation $\Theta_N$ of the posterior density (85), we see that the index set $\Lambda^\Theta_N$
of active Taylor gpc coefficients of $\Theta_N$ satisfies
$$\Lambda^\Theta_N \subseteq \Lambda^\Phi_N := (\Lambda_N + \Lambda_N) + \dots\,(K(N)\text{-times})\,\dots + (\Lambda_N + \Lambda_N) \subset \mathcal{F}$$
where $\Lambda_N \subset \mathcal{F}$ is the set of $N$ active gpc coefficients in the approximate forward solver
in Assumption 6.3.

If, in particular, $\Lambda_N$ is monotone, so is the set $\Lambda^\Theta_N$. This follows by induction over $K$
with the argument in the last part of the proof of Lemma 6.4. Moreover, the cardinality
of $\Lambda^\Theta_N$ is bounded by $2NK(N) \lesssim N\log(N)$.
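
The monotonicity argument used in this remark can be tested on concrete index sets. The Python sketch below is only an illustration: the helper names `is_monotone` and `sumset` and the sample set `Lambda` are assumptions for this example. Multi-indices are stored as tuples, and a set is called monotone if it contains $\nu - e_j$ whenever it contains $\nu$ with $\nu_j > 0$.

```python
def is_monotone(index_set):
    """True if nu in the set and nu_j > 0 always imply nu - e_j in the set."""
    indices = set(index_set)
    for nu in indices:
        for j, nu_j in enumerate(nu):
            if nu_j > 0:
                lowered = nu[:j] + (nu_j - 1,) + nu[j + 1:]
                if lowered not in indices:
                    return False
    return True


def sumset(a, b):
    """All componentwise sums nu + nu' with nu in a and nu' in b."""
    return {tuple(x + y for x, y in zip(nu, nu_p)) for nu in a for nu_p in b}


# Hypothetical monotone index set in two variables.
Lambda = {(0, 0), (1, 0), (0, 1), (2, 0)}
repeated = sumset(sumset(Lambda, Lambda), Lambda)
print(is_monotone(Lambda), is_monotone(sumset(Lambda, Lambda)), is_monotone(repeated))
```

Repeating `sumset` mirrors the induction over $K$: each additional sum of monotone sets is again monotone.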

7. Conclusions 

This paper is concerned with formulation of Bayesian inversion as a problem in infinite 
dimensional parametric integration, and the construction of algorithms which exploit 
analyticity of the forward map from state space to data space to approximate these 
integration problems. In this section we make some concluding remarks about the 
implications of our analysis. We discuss computational complexity for such problems, 
and we discuss further directions for research. 

7.1. Computational Cost: Idealized Analysis 

Throughout we have been guided by the desire to create algorithms which outperform
Monte Carlo based methods. To gain insight into this issue we first proceed under the
(idealized) setting of Assumptions 6.1 and 6.3, which imply that the PDE (14), for fixed
parameter $u$, and its parametric solution, for all $u \in U$, can both be approximated at
unit cost. In this situation we can study the cost per unit error of Monte Carlo and gpc
methods as follows. We neglect logarithmic corrections for clarity of exposition. The
Monte Carlo method will require $O(N)$ work to achieve an error of size $N^{-1/2}$, where
$N$ is a number of samples from the prior. To obtain error $\varepsilon$ thus requires work of
order $O(\varepsilon^{-2})$. Recall the parameter $\sigma$ from Assumption 5.3, which measures the rate of
decay of the input fluctuations and, as we have shown, governs the smoothness properties
of the analytic map from unknown to data. The gpc method based on best $N$-term
approximation requires work which is linear in $N$ to obtain an error of size $N^{-(1/\sigma-1)}$.
Thus to obtain error $\varepsilon$ requires work of order $O(\varepsilon^{-\sigma/(1-\sigma)})$. For all $\sigma < 2/3$ the complexity
of the new gpc methods, under our idealized assumptions, is superior to that of Monte
Carlo based methods.
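
The comparison of the two work estimates can be tabulated directly. The Python sketch below is illustrative only: the function names `monte_carlo_work` and `gpc_work` are assumptions for this example, all constants and logarithmic factors are set to one, and the work is taken to be exactly $\varepsilon^{-2}$ and $\varepsilon^{-\sigma/(1-\sigma)}$ respectively.

```python
def monte_carlo_work(eps):
    """Idealized Monte Carlo work eps**(-2), constants omitted."""
    return eps ** -2.0


def gpc_work(eps, sigma):
    """Idealized gpc work eps**(-sigma / (1 - sigma)), constants omitted."""
    return eps ** (-sigma / (1.0 - sigma))


eps = 1e-3
for sigma in (0.25, 0.5, 2.0 / 3.0, 0.8):
    print(sigma, gpc_work(eps, sigma), monte_carlo_work(eps))
```

The exponents coincide at $\sigma = 2/3$, since $\sigma/(1-\sigma) = 2$ there; smaller $\sigma$ favours the gpc estimate and larger $\sigma$ favours Monte Carlo.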

7.2. Computational Cost: Practical Issues 

The analysis of the previous subsection provides a clear way to understand the potential 
of the methods introduced in this paper and is useful for communicating the central idea. 
However, by working under the stated Assumptions 6.1 and 6.3, some aspects of the
true computational complexity of the problem are hidden. In this subsection we briefly
discuss further issues that arise. Throughout we assume that the desired form of the
unknown diffusion coefficient for the forward PDE is given by (19) in the case where
$\mathbb{J} = \mathbb{N}$:
$$u(x,y) = \bar{a}(x) + \sum_{j \ge 1} y_j \psi_j(x), \quad x \in D. \tag{92}$$
To quantify the complexity of the problem we assume that, for some $b > 0$,
$$\|\psi_j\|_{L^\infty(D)} \lesssim j^{-(1+b)}. \tag{93}$$
Then Assumption 5.3 holds for any $\sigma > (1+b)^{-1}$. In practice, to implement either
Monte Carlo or gpc based methods it is necessary to truncate the series (92) to $J$ terms
to obtain
$$u^J(x,y) = \bar{a}(x) + \sum_{1 \le j \le J} y_j \psi_j(x), \quad x \in D. \tag{94}$$

To quantify the computational cost of the problem we assume that the non-parametric
forward problem (14), with fixed $u \in U$, incurs costs $\mathrm{pde}(J, \varepsilon)$ to make an error of size
$\varepsilon$ in $V$. Likewise we assume that the parametric forward problem (14), for all $u \in U$,
incurs costs $\mathrm{ppde}(N, J, \varepsilon)$ to make an error of $\varepsilon$ in $L^2(U, \mu_0(dy); V)$ via computation of
an approximation to a quasi-optimal best $N$-term gpc approximation.

Both Monte Carlo based and gpc based methods will incur an error caused by
truncation to $J$ terms. Using the Lipschitz property of $\mathcal{G}$ expressed in (26), together
with the arguments developed in [8]‡, we deduce that the error in computing expectations
caused by truncation of the input data to $J$ terms is proportional to
$$\sum_{j=J+1}^{\infty} \|\psi_j\|_{L^\infty(D)}.$$
Under assumption (93) this is of order $O(J^{-b})$ and since $b$ may be chosen arbitrarily
close to $1/\sigma - 1$ we obtain an error $O(J^{-(1-\sigma)/\sigma})$ from truncation.

‡ The key idea in [8] is that error in the forward problem transfers to error in the Bayesian inverse
problem, as measured in the Hellinger metric and hence for a wide class of expectations; the analysis
in [8] is devoted to Gaussian priors and situations where the Lipschitz constant of the forward model
depends on the realization of the input data $u$ and the Fernique theorem is used to control this dependence;
this is more complex than required here, because the Lipschitz constants in (26) here do not depend on
the realization of the input data $u$. For these reasons we do not feel it is necessary to provide a proof
of the error incurred by truncation.
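
The $O(J^{-b})$ decay of the truncation tail can be checked with a few lines of Python. This is a sketch under the assumption that $\|\psi_j\|_{L^\infty(D)} = j^{-(1+b)}$ exactly; the function name `tail_sum` and the cutoff `n_terms`, which replaces the infinite sum by a long finite one, are introduced only for this example. The integral comparison $\sum_{j > J} j^{-(1+b)} \le J^{-b}/b$ gives the reference value.

```python
def tail_sum(J, b, n_terms=100000):
    """Finite approximation of sum_{j > J} j**(-(1 + b)) using n_terms terms."""
    return sum(j ** -(1.0 + b) for j in range(J + 1, J + 1 + n_terms))


b = 1.0
for J in (10, 100, 1000):
    # Compare the computed tail with the integral bound J**(-b) / b.
    print(J, tail_sum(J, b), J ** -b / b)
```

Multiplying $J$ by ten reduces both columns by roughly the factor $10^{-b}$, which is the algebraic decay used in the text.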

The total error for Monte Carlo based methods using $N$ samples is then of the form
$$\frac{C(J)}{\sqrt{N}} + J^{-(1-\sigma)/\sigma}.$$
In the case where $C(J)$ is independent of $J$, which arises for pure Monte Carlo methods
based on prior sampling and for the independence MCMC sampler [15, 20], choosing
$N$ and $J$ to balance the error gives $N = O(\varepsilon^{-2})$ and $J = O(\varepsilon^{-\sigma/(1-\sigma)})$ and, with these
relationships imposed, the cost is $N \times \mathrm{pde}(J, \varepsilon)$ since one forward PDE solve is made
at each step of any Monte Carlo method. In practice standard Monte Carlo sampling
may be ineffective, because samples from the prior are not well-distributed with respect
to the posterior density; this is especially true for problems with large numbers of
observations and/or small observational noise. In this case MCMC methods may be
favoured and it is possible that $C(J)$ will grow with $J$; see [21] for an analysis of this
effect for random walk Metropolis algorithms. Balancing the error terms will then lead
to a further increase in computational cost.

For gpc methods based on $N$-term truncation the error is of the form
$$N^{-(1-\sigma)/\sigma} + J^{-(1-\sigma)/\sigma},$$
implying that $N = J = O(\varepsilon^{-\sigma/(1-\sigma)})$ to balance errors. These expressions must be
substituted into $\mathrm{ppde}(N, J, \varepsilon)$ to deduce the asymptotic cost.

In practice, however, the gpc methods can also suffer when the number of observed 
data is high, or when the observational noise is small. To see this, note that the choice 
of active terms in the expansion (55) is independent of the data, and is determined by
the prior. For these reasons it may be computationally expedient in practice to study 
methods which marry MCMC and gpc [161 CZl [IB]- In a forthcoming paper [TT] we will 
investigate the performance of the gpc-based posterior approximations, in particular in
the case of values of $\sigma$ which are close to $\sigma = 1$, i.e. in the case of little or no sparsity
in the expansion of the unknown $u$, for parametric precomputation of an approximation
of the law of the forward model, removing the necessity to compute a forward solution
at each step, and by extending this idea further to Multi-Level MCMC.

7.3. Outlook 

We have considered a class of inverse diffusion problems with unknown diffusion
coefficient $u$, in the context of a Bayesian approach to the solution of these
inverse problems given the data $\delta$. For diffusion coefficients $u$ which are
spatially heterogeneous and whose uncertainty is parametrized by a countable number of
random coordinate variables, we have proved that sparsity in the gpc expansion of $u$
entails the same sparsity in the density of the Bayesian posterior with respect to the
prior measure.

We have provided a constructive proof of how to obtain an approximate posterior
density by an $O(N)$ term truncated gpc expansion, based on a set $\Lambda_N \subset \mathcal{F}$ of $N$ active gpc
coefficients in the parametric system's forward response. We have indicated that several
algorithms for the linear complexity computation of approximate parametrizations
including prediction of the sets $\Lambda_N$ with quasi optimality properties (in the sense of
best $N$-term approximations) are now available.

In [1] , based on the present work, we present a detailed analysis including the error 
incurred through Finite Element discretization of the forward problem in the physical 
domain $D$, under slightly stronger hypotheses on the data $u$ and $f$ than studied here.
Implementing these methods, and comparing them with other methods such as those
studied in [11], will provide further guidance for the development of the promising ideas
introduced in this paper, and variants on them.

Furthermore, we have assumed in the present paper that the observation functional
$\mathcal{O}(\cdot) \in V^*$, which precludes, in space dimensions 2 and higher, point observations. Once
again, results which are completely analogous to those in the present paper hold also
for such observations, albeit again under stronger hypotheses on $u$ and on $f$. This will also be
elaborated on in [1].

As indicated in [5], [6], [3, [22l [3l [10] the gpc parametrizations (by either Taylor- or 
Legendre type polynomial chaos representations) of the laws of these quantities allow 
a choice of discretization of each gpc coefficient of the quantity of interest by sparse
tensorization of hierarchic bases in the physical domain $D$ and the gpc basis functions
$L_\nu(y)$ resp. $y^\nu$ so that the additional discretization error incurred by the discretization
in $D$ can be kept of the order of the gpc truncation error with an overall computational
complexity which does not exceed that of a single, deterministic solve of the forward
problem. These issues will be addressed in [1] as well.